Nagios alerts when host has SERVICE DOWNTIME?
Posted: Tue Jan 26, 2016 5:58 am
Hi, hoping someone can answer this for me as it's got me puzzled. I'm running Nagios Core 4.0.8 on SuSE Linux and have a weird scenario where a service that was in SERVICE DOWNTIME still sent an alert out. How can this be? A bit of background info for you.
The SERVICE DOWNTIME was scheduled for 23:10 to 23:20 on the service check called UBSDealServer_Startup_Activity. This service check normally has the following check period defined:
sunday 23:12-23:15
monday 07:32-07:35,23:12-23:15
tuesday 07:32-07:35,23:12-23:15
wednesday 07:32-07:35,23:12-23:15
thursday 07:32-07:35,23:12-23:15
friday 07:32-07:35
So how can it have sent an alert out at Mon Jan 25 23:14:47 2016 when it was in SERVICE DOWNTIME? The logs show it was in downtime as well:
[Mon Jan 25 00:00:00 2016] CURRENT SERVICE STATE: TDUKUBS01;UBSDealServer_Startup_Activity;OK;HARD;1;OK: 'Execution Report' and 'Received on Exchange' found in log file UBSDealServer.log within last 5 mins
[Mon Jan 25 23:09:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STARTED; Service has entered a period of scheduled downtime
[Mon Jan 25 23:14:47 2016] SERVICE ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;CRITICAL;HARD;1;ERROR: 'Execution Report' and 'Received on Exchange' NOT found in log file UBSDealServer.log within last 5 mins
[Mon Jan 25 23:19:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STOPPED; Service has exited from a period of scheduled downtime
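In case anyone wants to check their own logs for the same pattern, here is a minimal Python sketch that flags SERVICE ALERT entries logged while the same service was between a downtime STARTED and STOPPED entry. It assumes the human-readable timestamp format pasted above (the raw nagios.log normally uses epoch timestamps, which would need a different parser), and the function names are just my own for illustration. Note that it only looks at SERVICE ALERT lines, which record state changes; it does not tell you whether a notification actually went out.

```python
import re
from datetime import datetime

# Matches lines such as: [Mon Jan 25 23:14:47 2016] SERVICE ALERT: host;service;...
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<kind>[A-Z ]+): (?P<body>.*)$")

# The four log lines from this post, copied as-is.
SAMPLE_LOG = [
    "[Mon Jan 25 00:00:00 2016] CURRENT SERVICE STATE: TDUKUBS01;UBSDealServer_Startup_Activity;OK;HARD;1;OK: 'Execution Report' and 'Received on Exchange' found in log file UBSDealServer.log within last 5 mins",
    "[Mon Jan 25 23:09:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STARTED; Service has entered a period of scheduled downtime",
    "[Mon Jan 25 23:14:47 2016] SERVICE ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;CRITICAL;HARD;1;ERROR: 'Execution Report' and 'Received on Exchange' NOT found in log file UBSDealServer.log within last 5 mins",
    "[Mon Jan 25 23:19:59 2016] SERVICE DOWNTIME ALERT: TDUKUBS01;UBSDealServer_Startup_Activity;STOPPED; Service has exited from a period of scheduled downtime",
]


def parse_line(line):
    """Return (timestamp, entry type, ';'-separated fields), or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    # Assumes English day/month abbreviations, as in the pasted log.
    timestamp = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
    return timestamp, match.group("kind"), match.group("body").split(";")


def alerts_in_downtime(lines):
    """Collect SERVICE ALERT entries logged while that service was in downtime."""
    in_downtime = set()  # (host, service) pairs currently in scheduled downtime
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        timestamp, kind, fields = parsed
        if len(fields) < 3:
            continue
        key = (fields[0], fields[1])
        if kind == "SERVICE DOWNTIME ALERT":
            if fields[2] == "STARTED":
                in_downtime.add(key)
            else:
                # Anything other than STARTED is treated as the end of the downtime.
                in_downtime.discard(key)
        elif kind == "SERVICE ALERT" and key in in_downtime:
            hits.append((timestamp, fields[0], fields[1], fields[2]))
    return hits


if __name__ == "__main__":
    for timestamp, host, service, state in alerts_in_downtime(SAMPLE_LOG):
        print(f"{timestamp} {host} {service} went {state} during downtime")
```

Run against the lines above, this picks out the 23:14:47 CRITICAL entry as falling inside the downtime window.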
The only thing I can think of is that the SERVICE DOWNTIME started before the actual TIMEPERIOD TRANSITION at Mon Jan 25 23:12:00 2016. Could the transition have somehow overridden the SERVICE DOWNTIME so that the service check still ran?
Any help would be appreciated, thanks.