Host in Downtime but receiving host notification

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
brdr
Posts: 312
Joined: Mon Jun 02, 2014 12:49 pm

Host in Downtime but receiving host notification

Post by brdr »

Hi,

We have XI 2014R2.7.

Today a remote site experienced a network issue. The network devices are in a host group, and I was asked to put the devices in downtime. To do this I did the following:

- Details -> Hostgroup Summary -> Find the host group and View Hostgroup Commands -> Schedule downtime for all services in the hostgroup. Then, once the fixed period was set, I ticked the check box 'Schedule Downtime for Hosts Too'.
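For reference, the same request can be sketched as Nagios Core external commands. This is only an illustration, assuming the UI action corresponds to `SCHEDULE_HOSTGROUP_SVC_DOWNTIME` plus `SCHEDULE_HOSTGROUP_HOST_DOWNTIME`; the function name, host group name, author, and timestamps are hypothetical.

```python
def hostgroup_downtime_commands(hostgroup, start, end, author, comment, now):
    """Build external command lines for fixed downtime on a host group.

    start, end and now are Unix timestamps (seconds). Field order is:
    hostgroup;start;end;fixed;trigger_id;duration;author;comment
    """
    fixed = 1          # 1 = fixed downtime, 0 = flexible
    trigger_id = 0     # 0 = not triggered by another downtime entry
    duration = end - start  # only meaningful for flexible downtime
    args = f"{hostgroup};{start};{end};{fixed};{trigger_id};{duration};{author};{comment}"
    return [
        f"[{now}] SCHEDULE_HOSTGROUP_HOST_DOWNTIME;{args}",
        f"[{now}] SCHEDULE_HOSTGROUP_SVC_DOWNTIME;{args}",
    ]


# Hypothetical values, for illustration only.
for line in hostgroup_downtime_commands(
    "remote-site", 1000, 4600, "nagiosadmin", "network issue", now=900
):
    print(line)
```

Each line would then be written to the Nagios external command file; its path depends on the installation and is not shown here.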

I went into HOME -> Host/Service detail and checked the comments for these hosts, which read:
By Nagios Administrator at 2015-08-21 14:47:44
This host has been scheduled for fixed downtime from 08-21-2015 14:42:59 to 08-21-2015 19:00:00. Notifications for the host will not be sent out during that time period.


Do you know why a host recovery notification would be sent out?

Thx
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
Contact:

Re: Host in Downtime but receiving host notification

Post by Box293 »

Was the recovery notification sent after the downtime period ended?

In XI, find the Service and click on it
There are four icons at the top
The second icon is "View Service Notifications"
The third icon is "View Service History"

For both of these icons:
Click the icon
Change the Period to This Week
Click Update
Take a screenshot

Please show us both screenshots.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
brdr
Posts: 312
Joined: Mon Jun 02, 2014 12:49 pm

Re: Host in Downtime but receiving host notification

Post by brdr »

There was no issue with any services, just the host, and no recovery was sent out after the downtime period ended.

You can see from the lines in bold below that a host recovery notification went out during the period of scheduled downtime.

[Fri Aug 21 14:43:28 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Status;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Bandwidth;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Swap Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Memory Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Last Rebooted;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Disk Volumes Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_DNS Lookup;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 21 15:00:34 2015] HOST ALERT: igm01.londen01;UP;HARD;1;OK - 10.48.254.40: rta 76.767ms, lost 0%
[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: it-netops;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%
[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: test;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%

[Fri Aug 21 15:00:44 2015] SERVICE ALERT: igm01.londen01;_Memory Usage;OK;HARD;3;Memory buffers: 9%used(173MB/1939MB) (<80%) : OK
[Fri Aug 21 15:00:54 2015] SERVICE ALERT: igm01.londen01;_DNS Lookup;OK;HARD;3;DNS OK: 0.243 seconds response time. solarwinds.com returns 74.115.13.20
[Fri Aug 21 15:01:03 2015] SERVICE ALERT: igm01.londen01;_Swap Usage;OK;HARD;3;Swap space: 0%used(0MB/2000MB) (<80%) : OK
[Fri Aug 21 15:01:44 2015] SERVICE ALERT: igm01.londen01;eth1 Status;OK;HARD;5;OK: Interface eth1 (index 12) is up.
[Fri Aug 21 15:10:14 2015] SERVICE ALERT: igm01.londen01;_Disk Volumes Usage;OK;HARD;3;/dev/shm: 0%used(0MB/1MB) /storage: 2%used(3665MB/235817MB) /tmpfs: 5%used(0MB/5MB) /reporting: 3%used(33MB/1027MB) /config: 3%used(33MB/1027MB) /: 47%used(1777MB/3743MB) (<98%) : OK
[Fri Aug 21 15:17:14 2015] HOST ALERT: igm01.londen01;UNREACHABLE;SOFT;1;CRITICAL - 10.48.254.40: rta nan, lost 100%
[Fri Aug 21 15:17:44 2015] HOST ALERT: igm01.londen01;UP;SOFT;2;OK - 10.48.254.40: rta 76.687ms, lost 0%
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Swap Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Bandwidth;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;eth1 Bandwidth;DOWNTIMEEND (OK);xi_service_notification_handler;OK - Current BW in: 0Mbps Out: .01Mbps
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Status;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Last Rebooted;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;_Last Rebooted;DOWNTIMEEND (OK);xi_service_notification_handler;OK - device is up since 12d 11h 58m 59s
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Memory Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Disk Volumes Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_DNS Lookup;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;_DNS Lookup;DOWNTIMEEND (OK);xi_service_notification_handler;DNS OK: 0.088 seconds response time. solarwinds.com returns 74.115.13.20
[Fri Aug 21 23:59:59 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED;per Tom. Network troubleshooting continues
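A rough way to pick such lines out of a larger log is sketched below. It assumes the human-readable line layout shown above, counts `STARTED` and `STOPPED`/`CANCELLED` host downtime alerts, and reports host notifications logged while the counter is above zero. The function name is hypothetical, and this only reads the log; it does not explain why Nagios sent the notification.

```python
import re

# "[timestamp] KIND: semicolon;separated;fields"
LINE_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\] (?P<kind>[A-Z ]+): (?P<body>.*)$")


def host_notifications_in_downtime(lines, host):
    """Return HOST NOTIFICATION lines for `host` logged during host downtime."""
    active = 0      # number of host downtime entries currently in effect
    flagged = []
    for raw in lines:
        line = raw.strip()
        match = LINE_RE.match(line)
        if not match:
            continue
        kind = match.group("kind")
        fields = match.group("body").split(";")
        if kind == "HOST DOWNTIME ALERT" and len(fields) >= 2 and fields[0] == host:
            if fields[1] == "STARTED":
                active += 1
            elif fields[1] in ("STOPPED", "CANCELLED"):
                active = max(0, active - 1)
        elif kind == "HOST NOTIFICATION" and len(fields) >= 3 and fields[1] == host:
            # Skip DOWNTIMESTART / DOWNTIMEEND notices; those are expected.
            if active > 0 and not fields[2].startswith("DOWNTIME"):
                flagged.append(line)
    return flagged


sample = [
    "[Fri Aug 21 14:43:28 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime",
    "[Fri Aug 21 15:00:34 2015] HOST ALERT: igm01.londen01;UP;HARD;1;OK - 10.48.254.40: rta 76.767ms, lost 0%",
    "[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: it-netops;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%",
]
for hit in host_notifications_in_downtime(sample, "igm01.londen01"):
    print(hit)
```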
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Host in Downtime but receiving host notification

Post by jdalrymple »

I can't recreate the problem. Can you on your system?

There is one thing I know: with flexible downtime there was (and continues to be in 4.0.8) a bug where the downtime didn't take effect until after the first notification. This doesn't read like that though, and you indicated it was indeed a fixed downtime. I've tried to recreate it here and I simply can't.

Is it safe to assume that the output we're looking at is `grep igm01.londen01 nagios.log` IN ITS ENTIRETY for that time period, run through a script that converts the timestamps to human-readable form?
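For anyone wanting to produce that kind of output, a minimal sketch follows. It assumes raw `nagios.log` lines start with a bracketed Unix timestamp such as `[1440182608]`; the function names are hypothetical, and UTC is used by default only to keep the example deterministic.

```python
import re
from datetime import datetime, timezone

# Raw log lines begin with a bracketed Unix timestamp, e.g. "[1440182608] ..."
EPOCH_RE = re.compile(r"^\[(\d+)\]")


def humanize(line, tz=timezone.utc):
    """Replace the leading epoch timestamp with a readable date."""
    def repl(match):
        stamp = datetime.fromtimestamp(int(match.group(1)), tz=tz)
        return "[" + stamp.strftime("%a %b %d %H:%M:%S %Y") + "]"
    return EPOCH_RE.sub(repl, line)


def grep_host(lines, host, tz=timezone.utc):
    """Keep lines mentioning `host` (plain substring match, like grep)."""
    return [humanize(line.rstrip("\n"), tz) for line in lines if host in line]


# Usage sketch; the log path depends on the installation:
# with open("nagios.log") as fh:
#     for entry in grep_host(fh, "igm01.londen01"):
#         print(entry)
```

Pass a different `tz` to match the server's local time when comparing against timestamps shown in the web interface.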
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
Contact:

Re: Host in Downtime but receiving host notification

Post by Box293 »

brdr wrote: There was no issue with any services, just the host, and no recovery was sent out after the downtime period ended.

You can see from the lines in bold below that a host recovery notification went out during the period of scheduled downtime.
If I'm understanding this correctly:
  • Downtime started
  • Host went down during downtime
  • Host came back up during downtime
  • Downtime ended
Do you expect a recovery message to be sent AFTER the downtime ended when the host recovered during the downtime period? If that is what you want then Nagios does not work this way.

With your bold lines, did you actually receive these notifications?
brdr
Posts: 312
Joined: Mon Jun 02, 2014 12:49 pm

Re: Host in Downtime but receiving host notification

Post by brdr »

Almost...

Downtime started for all hosts/services in the host group
While in downtime a host recovered (it was down before downtime started)
Host Recovery Notifications were received while in downtime
Downtime ended

I did not expect a recovery message after downtime and did not expect a notification while in downtime.

I can try this again (put a host group (hosts/services) in downtime) this week and see if this is repeatable.

Keep you posted. Thanks.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1
Contact:

Re: Host in Downtime but receiving host notification

Post by hsmith »

Sounds good, just let us know.
Former Nagios Employee.
me.
brdr
Posts: 312
Joined: Mon Jun 02, 2014 12:49 pm

Re: Host in Downtime but receiving host notification

Post by brdr »

Please close. I think this had to do with timing: the scheduling of the host check relative to the scheduling of the downtime. If this changes I will open it back up.

Thx