
Nagios alerts when host has SERVICE DOWNTIME?

Posted: Tue Jan 26, 2016 5:58 am
by julian924s
Hi - hoping someone can answer this for me, as it's got me puzzled. I'm running Nagios Core 4.0.8 on SuSE Linux and have a weird scenario where a service that was in SERVICE DOWNTIME still sent an alert out. How can this be? Here is a bit of background info.

The SERVICE DOWNTIME was scheduled for 23:10 to 23:20 on the service check called UBSDealServer_Startup_Activity. This service check normally has the following check period defined:

sunday 23:12-23:15
monday 07:32-07:35,23:12-23:15
tuesday 07:32-07:35,23:12-23:15
wednesday 07:32-07:35,23:12-23:15
thursday 07:32-07:35,23:12-23:15
friday 07:32-07:35

So how can it have sent an alert out at Mon Jan 25 23:14:47 2016, when it was in SERVICE DOWNTIME? The logs show it was in downtime as well.
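As a quick sanity check on the times quoted above, here is a minimal Python sketch (not part of Nagios) that tests whether a timestamp falls inside the check period. The dictionary copies the ranges from this post; the names `CHECK_PERIOD` and `in_check_period`, and the choice to treat each range as start-inclusive and end-exclusive, are assumptions made for illustration only.

```python
from datetime import datetime

# Ranges copied from the check period in this post (no Saturday entry is listed).
CHECK_PERIOD = {
    "sunday": "23:12-23:15",
    "monday": "07:32-07:35,23:12-23:15",
    "tuesday": "07:32-07:35,23:12-23:15",
    "wednesday": "07:32-07:35,23:12-23:15",
    "thursday": "07:32-07:35,23:12-23:15",
    "friday": "07:32-07:35",
}

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def parse_ranges(spec):
    """Turn '07:32-07:35,23:12-23:15' into a list of (start, end) times."""
    ranges = []
    for part in spec.split(","):
        start, end = part.split("-")
        ranges.append((datetime.strptime(start, "%H:%M").time(),
                       datetime.strptime(end, "%H:%M").time()))
    return ranges


def in_check_period(when):
    """True if 'when' falls in a range for its weekday (end-exclusive: an assumption)."""
    spec = CHECK_PERIOD.get(DAYS[when.weekday()])
    if spec is None:
        return False
    return any(start <= when.time() < end for start, end in parse_ranges(spec))


alert_time = datetime(2016, 1, 25, 23, 14, 47)  # the alert time from the log
print(in_check_period(alert_time))
```

Under these assumptions the alert time sits inside Monday's 23:12-23:15 range, so the check itself was expected to run at that point; the sketch says nothing about whether a notification should follow.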

[Mon Jan 25 00:00:00 2016] CURRENT SERVICE STATE: TDUKUBS01;UBSDealServer_Startup_Activity;OK;HARD;1;OK: 'Execution Report' and 'Received on Exchange' found in log file UBSDealServer.log within last 5 mins
[Mon Jan 25 23:09:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STARTED; Service has entered a period of scheduled downtime
[Mon Jan 25 23:14:47 2016] SERVICE ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;CRITICAL;HARD;1;ERROR: 'Execution Report' and 'Received on Exchange' NOT found in log file UBSDealServer.log within last 5 mins
[Mon Jan 25 23:19:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STOPPED; Service has exited from a period of scheduled downtime
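The ordering in the log lines above can be checked mechanically. This is only a sketch: it parses the human-readable timestamps exactly as pasted in this post (a raw nagios.log may store timestamps differently), and the helper name `parse` is made up for illustration.

```python
from datetime import datetime

# Three of the log lines from this post, copied verbatim.
LOG_LINES = [
    "[Mon Jan 25 23:09:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STARTED; Service has entered a period of scheduled downtime",
    "[Mon Jan 25 23:14:47 2016] SERVICE ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;CRITICAL;HARD;1;ERROR: 'Execution Report' and 'Received on Exchange' NOT found in log file UBSDealServer.log within last 5 mins",
    "[Mon Jan 25 23:19:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STOPPED; Service has exited from a period of scheduled downtime",
]


def parse(line):
    """Split a log line into (timestamp, entry type, list of ';'-separated fields)."""
    stamp, rest = line[1:].split("] ", 1)
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    entry_type, data = rest.split(": ", 1)
    return when, entry_type, data.split(";")


events = [parse(line) for line in LOG_LINES]

# Field index 2 holds STARTED / STOPPED for downtime entries.
start = next(w for w, t, f in events
             if t == "SERVICE DOWNTIME ALERT" and f[2] == "STARTED")
stop = next(w for w, t, f in events
            if t == "SERVICE DOWNTIME ALERT" and f[2] == "STOPPED")

# Service alerts logged between the downtime start and stop entries.
alerts_in_downtime = [w for w, t, f in events
                      if t == "SERVICE ALERT" and start <= w <= stop]
print(alerts_in_downtime)
```

This confirms that the SERVICE ALERT entry was logged between the STARTED and STOPPED downtime entries.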

The only thing I can think of is that the SERVICE DOWNTIME started before the actual TIMEPERIOD TRANSITION at Mon Jan 25 23:12:00 2016. Could the transition somehow have overridden the SERVICE DOWNTIME, so that the service check ran anyway?

Any help would be appreciated, thanks.

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Tue Jan 26, 2016 10:24 am
by rkennedy
Can you post the service definition for UBSDealServer_Startup_Activity for us to take a look at?

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Tue Jan 26, 2016 11:07 am
by julian924s
Sure, so the service definition is:

define service{
    use                     remote-service
    host_name               TDUKUBS01
    service_description     UBSDealServer_Startup_Activity
    max_check_attempts      1
    contact_groups          ecomm-supp
    check_command           check_nrpe!UBSDealServer_Startup_Activity
    check_period            ubslogs1
    }


The time period is the one listed above, and the template used is remote-service, which is:

define service{
    name                    remote-service
    use                     generic-service
    is_volatile             0
    check_period            24x7
    max_check_attempts      3
    normal_check_interval   3
    notification_interval   60
    notifications_enabled   1
    retry_check_interval    1
    register                0
    }

Thanks - Julian.

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Tue Jan 26, 2016 5:23 pm
by tgriep
Can you run the Notifications report for that time period and post that information here?

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Thu Jan 28, 2016 8:13 am
by julian924s
OK, I'm really confused. I've just looked, and there is nothing around that time for this host. In fact, there is nothing for that host for that whole day. I chose "All notifications" as the filter, from 25th to 26th Jan. But the log extract above shows an alert being sent, so I'm even more confused now. I've also just checked the Nagios log file /var/log/nagios/archives/nagios-01-26-2016-00.log and the alerts are still in that file.

Thanks.

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Thu Jan 28, 2016 4:28 pm
by tgriep
Here is a quick description of Alerts and Notifications.

When a host or service check runs and finds an issue, Nagios records an alert for it. Alerts by themselves do not send emails, SMS messages, etc.
If you have set up notification options for a host or service and it gets an alert, then Nagios sends a notification (an email, for example).

When you schedule downtime, the system will continue to run checks and log alerts, but notifications will NOT be sent during the downtime.
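To illustrate the distinction, here is a minimal Python sketch (not part of Nagios) that groups the log lines posted earlier in this thread by entry type. Nagios Core writes notifications as separate SERVICE NOTIFICATION or HOST NOTIFICATION log entries; the helper name `entry_type` is made up for illustration, and the input is only the excerpt that was pasted, not the full log.

```python
from collections import Counter

# The four log lines posted earlier in this thread, copied verbatim.
LOG_LINES = [
    "[Mon Jan 25 00:00:00 2016] CURRENT SERVICE STATE: TDUKUBS01;UBSDealServer_Startup_Activity;OK;HARD;1;OK: 'Execution Report' and 'Received on Exchange' found in log file UBSDealServer.log within last 5 mins",
    "[Mon Jan 25 23:09:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STARTED; Service has entered a period of scheduled downtime",
    "[Mon Jan 25 23:14:47 2016] SERVICE ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;CRITICAL;HARD;1;ERROR: 'Execution Report' and 'Received on Exchange' NOT found in log file UBSDealServer.log within last 5 mins",
    "[Mon Jan 25 23:19:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STOPPED; Service has exited from a period of scheduled downtime",
]


def entry_type(line):
    """Return the entry type: the text after the timestamp, up to the first ': '."""
    rest = line.split("] ", 1)[1]
    return rest.split(": ", 1)[0]


# Count how many entries of each type appear in the excerpt.
counts = Counter(entry_type(line) for line in LOG_LINES)

# Notifications are logged under their own entry types, separate from alerts.
notifications = [line for line in LOG_LINES
                 if entry_type(line) in ("SERVICE NOTIFICATION", "HOST NOTIFICATION")]

print(counts)
print(len(notifications))
```

In this excerpt there is one SERVICE ALERT entry and no notification entries, which matches an empty Notifications report for that day.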

Does that make sense?

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Mon Feb 01, 2016 9:20 am
by julian924s
Hi - yes, it does make sense, but the service was in downtime. You say: "When you schedule downtime, the system will continue to run and receive alerts but notifications will NOT be sent during downtime".

The Nagios server did still send an alert out, or have I missed something?

Thanks - Julian.

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Mon Feb 01, 2016 6:13 pm
by hsmith
Have you been able to reproduce the behavior? Did you make sure the system time on your Nagios server is correct? You can check this with the date command.

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Mon Feb 01, 2016 6:19 pm
by lmiltchev
julian924s wrote: "The Nagios server did still send an alert out, or have I missed something?"
Can you show us the actual email notification that you received?

Re: Nagios alerts when host has SERVICE DOWNTIME?

Posted: Mon Feb 01, 2016 6:20 pm
by julian924s
Hi, in all honesty I've not tried to replicate this. It never occurred to me, as the original issue was seen by a colleague of mine who asked me to look at it. I can try, though. As for the date / time, the server has ntpd configured and points to several of the UK pool.ntp.org servers, so it should be pretty accurate most of the time, no pun intended!!

Thanks - J.