OK notification

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

OK notification

Post by atsb »

Hello!

About the Nagios XI environment:
1) Linux Distribution and version? CentOS 7
2) 32 or 64bit? 64bit
3) VMware Image or Manual Install of XI? Manual install
4) Are there special configurations on your system, ie; is Gnome installed? Are you using a proxy? Are you using SSL? SSL
5) Nagios XI version: 5.5.7

Some of my users receive notifications only from 6:00-24:00 (a special timeperiod). One service went to CRITICAL status at 4:30, and only the users who did not have this timeperiod defined got the service CRITICAL message. When the service went back to the OK state at 6:15, it notified everyone, so some users just got a message that the service is OK, with no CRITICAL notification before it. Picture of this case:
nagios_notification.png
I want to point out that this service also has its own notification timeperiod defined, which is 4:30-24:00.
Last edited by atsb on Fri Dec 28, 2018 3:22 am, edited 1 time in total.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: OK notification

Post by lmiltchev »

It is possible that the issue was caused by multiple instances of nagios running on your system. Run the following command and show the output:

Code:

ps -ef | grep nagios.cfg | grep -v grep
Show us the actual service config, and the config of the contact who is getting the recovery notifications but not the critical ones, along with all relevant templates.

Log in as the xi user in question, click on the username in the upper right corner, then click on "Notification Preferences", and post a screenshot of this page.

Post the entire nagios.log (from the day when the issue happened).
Be sure to check out our Knowledgebase for helpful articles and solutions!
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

Re: OK notification

Post by atsb »

Hello!

Command output:

Code:

nagios    48442      1  0 Dec14 ?        00:44:57 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    48503  48442  0 Dec14 ?        00:00:19 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
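
The check requested above can also be scripted. Below is a minimal sketch, not an official Nagios tool, that reads `ps -ef` style lines and counts how many nagios processes are not children of another nagios process. It assumes the usual `ps -ef` column order (UID, PID, PPID, ...) and also assumes that a nagios process whose parent is another nagios process belongs to the same daemon rather than to a second instance. The function name `count_root_instances` is hypothetical.

```python
def count_root_instances(ps_output: str) -> int:
    """Count nagios processes whose parent is not itself in the listing."""
    procs = []
    for line in ps_output.strip().splitlines():
        fields = line.split()
        # Assumed ps -ef columns: UID PID PPID C STIME TTY TIME CMD...
        procs.append((int(fields[1]), int(fields[2])))
    pids = {pid for pid, _ in procs}
    # A process counts as a separate instance only if its parent
    # is not one of the other listed nagios processes.
    return sum(1 for _, ppid in procs if ppid not in pids)


sample = """\
nagios    48442      1  0 Dec14 ?        00:44:57 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    48503  48442  0 Dec14 ?        00:00:19 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
"""

print(count_root_instances(sample))  # prints 1
```

Under those assumptions, the two lines in the output above count as one instance, because PID 48503 has PID 48442 as its parent.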
Pictures of service config
service_config.png
service_config2.png
service_config3.png
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

Re: OK notification

Post by atsb »

More pictures; 3 is the maximum per post.
nagios_timeperiod.png
nagios_contact.png
nagios_timeperiod2.png
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

Re: OK notification

Post by atsb »

Preferences from user view:
nagios_pref.PNG
I will send nagios.log via message.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: OK notification

Post by lmiltchev »

The XI users/contacts that received both notifications for your service all had a 24x7 timeperiod (00:00-24:00 for each day of the week). The XI users/contacts that received only the OK (recovery) notification, and not the CRITICAL one, had a 06:00-24:00 timeperiod.

The CRITICAL event happened at 04:34:59, which is outside the 06:00-24:00 interval. Therefore, notifications were not sent to the contacts using the 06:00-24:00 timeperiod.

The service recovered at 06:15:25, which was within the 06:00-24:00 timeperiod, so notifications were sent.

It seems to me that nagios notified users as expected (as configured).
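
The filtering described above can be reproduced with a small simulation. This is only a sketch of the timeperiod check, not Nagios code: the contact names are hypothetical, each timeperiod is assumed to be the same every day, and an end of `None` stands for 24:00. The service's own 04:30-24:00 notification timeperiod is checked first, then each contact's timeperiod.

```python
from datetime import time


def in_timeperiod(t, start, end=None):
    """True if t falls in [start, end); end=None means until 24:00."""
    return t >= start and (end is None or t < end)


# Service-level notification timeperiod: 04:30-24:00.
service_period = (time(4, 30), None)

# Hypothetical contacts with their notification timeperiods.
contacts = {
    "contact_24x7": (time(0, 0), None),
    "contact_0600_2400": (time(6, 0), None),
}


def recipients(event_time):
    """Contacts that pass both the service and the contact timeperiod."""
    if not in_timeperiod(event_time, *service_period):
        return []
    return [name for name, (start, end) in contacts.items()
            if in_timeperiod(event_time, start, end)]


for state, at in [("CRITICAL", time(4, 34, 59)), ("OK", time(6, 15, 25))]:
    print(state, at, recipients(at))
# CRITICAL 04:34:59 ['contact_24x7']
# OK 06:15:25 ['contact_24x7', 'contact_0600_2400']
```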

You can read more on notifications and notification filters here:
https://assets.nagios.com/downloads/nag ... tions.html
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

Re: OK notification

Post by atsb »

How come? Shouldn't it notify at 6:00 that the service is in a critical state? To me, sending just an OK message without any preceding WARNING/CRITICAL/UNKNOWN notification is just plain wrong and confusing. Keep in mind that these users are unaware of the actual problem, and sending OK without any previous information will just prompt the question "WHAT?".
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: OK notification

Post by lmiltchev »

"How come? Shouldn't it notify at 6:00 that the service is in a critical state?"
It could, if it were configured to do so... In your service definition, you have:

Code:

notification_interval      0
which means that only one problem notification will be sent (the recovery notification is still sent), and, as expected, only one problem notification was sent, at 04:34:59.

As per our official documentation:
Service - notification interval

This directive is used to define the number of "time units" to wait before re-notifying a contact that this service is still in a non-OK state. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will not re-notify contacts about problems for this service - only one problem notification will be sent out.

Parameter name: notification_interval
Required: yes
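
To see what different `notification_interval` values would have meant for this incident, here is a rough model rather than actual Nagios behavior. It assumes re-notifications fire exactly every `notification_interval` minutes after the first problem notification (in practice they depend on when checks run), uses an arbitrary placeholder date, and uses the hypothetical function name `problem_notification_times`.

```python
from datetime import datetime, time, timedelta


def problem_notification_times(first, recovery, interval_minutes):
    """Problem notification times between the first alert and recovery."""
    times = [first]
    if interval_minutes > 0:  # 0 means no re-notification
        step = timedelta(minutes=interval_minutes)
        nxt = first + step
        while nxt < recovery:
            times.append(nxt)
            nxt += step
    return times


# The date is a placeholder; only the times of day come from the thread.
first = datetime(2018, 1, 1, 4, 34, 59)
recovery = datetime(2018, 1, 1, 6, 15, 25)

for interval in (0, 60, 30):
    sent = problem_notification_times(first, recovery, interval)
    # Notifications that fall inside the 06:00-24:00 contact timeperiod.
    seen = [t.strftime("%H:%M:%S") for t in sent if t.time() >= time(6, 0)]
    print(interval, len(sent), seen)
# 0 1 []
# 60 2 []
# 30 4 ['06:04:59']
```

In this model an interval of 60 would still not have reached the 06:00-24:00 contacts before the recovery at 06:15:25; only a shorter interval such as 30 produces a problem notification inside their timeperiod, at the cost of more repeat notifications to the 24x7 contacts.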
atsb
Posts: 75
Joined: Mon Feb 12, 2018 5:23 am

Re: OK notification

Post by atsb »

In that case it would keep spamming the other contacts about the same existing problem, and by that logic the interval would have to be set very low. One notification is enough for everyone to understand that something is wrong with this service/host.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: OK notification

Post by lmiltchev »

Again, nagios is a very flexible product. There are many different options - it is entirely up to you to configure it in a way that makes sense for your specific environment/needs. In this particular case, nagios is doing exactly what it is configured to do. Let us know if you have any more questions. Thank you!
Locked