Hi,
We have XI 2014R2.7.
Today a remote site experienced a network issue. The network devices are in a host group, and I was asked to put the devices in downtime. To do this I did:
- Details -> Hostgroup Summary -> Find the host group and View Hostgroup Commands -> Schedule downtime for all services in the hostgroup. Then, once the fixed period was set, I ticked the check box 'Schedule Downtime for Hosts Too'.
I went into HOME -> Host/Service detail and checked the comments for these hosts, which read:
By Nagios Administrator at 2015-08-21 14:47:44
This host has been scheduled for fixed downtime from 08-21-2015 14:42:59 to 08-21-2015 19:00:00. Notifications for the host will not be sent out during that time period.
Do you know why a host recovery notification would be sent out during that period?
Thx
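For anyone who wants to reproduce the same scheduling outside the web UI, here is a minimal sketch that builds the two Nagios Core external commands for fixed host group downtime (`SCHEDULE_HOSTGROUP_HOST_DOWNTIME` and `SCHEDULE_HOSTGROUP_SVC_DOWNTIME`). The host group name, author and comment are made-up placeholders, and whether XI issues exactly these commands behind the menu is an assumption. The sketch only builds the command strings; it does not write them to the command file.

```python
import time


def hostgroup_downtime_commands(hostgroup, start, end, author, comment, now=None):
    """Build external commands that schedule fixed downtime for every host
    and every service in a host group (hosts first, then services)."""
    now = int(time.time()) if now is None else int(now)
    fixed = 1        # 1 = fixed downtime, 0 = flexible downtime
    trigger_id = 0   # 0 = not triggered by another downtime entry
    # Duration is only honoured for flexible downtime, but the field is required.
    duration = int(end) - int(start)
    args = (f"{hostgroup};{int(start)};{int(end)};{fixed};"
            f"{trigger_id};{duration};{author};{comment}")
    return [
        f"[{now}] SCHEDULE_HOSTGROUP_HOST_DOWNTIME;{args}",
        f"[{now}] SCHEDULE_HOSTGROUP_SVC_DOWNTIME;{args}",
    ]


if __name__ == "__main__":
    start = int(time.time())
    # "remote-site-network" and "nagiosadmin" are hypothetical example values.
    for line in hostgroup_downtime_commands(
            "remote-site-network", start, start + 4 * 3600,
            "nagiosadmin", "Remote site network issue"):
        print(line)
```

Each printed line would then be appended to the Nagios external command file (its location depends on the installation), which avoids relying on the order of clicks in the UI.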
Host in Downtime but receiving host notification
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Host in Downtime but receiving host notification
Was the recovery notification sent after the downtime period ended?
In XI, find the Service and click on it
There are four icons at the top
The second icon is "View Service Notifications"
The third icon is "View Service History"
For both of these icons:
Click the icon
Change the Period to This Week
Click Update
Take a screenshot
Please show us both screenshots.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Host in Downtime but receiving host notification
There was no issue with any services, just the host, and no recovery notification was sent out after the downtime period ended.
You can see from the lines in bold below that a host recovery notification went out during the period of scheduled downtime.
[Fri Aug 21 14:43:28 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Status;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Bandwidth;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Swap Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Memory Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Last Rebooted;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Disk Volumes Usage;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_DNS Lookup;STARTED; Service has entered a period of scheduled downtime
[Fri Aug 21 14:47:45 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime
[Fri Aug 21 15:00:34 2015] HOST ALERT: igm01.londen01;UP;HARD;1;OK - 10.48.254.40: rta 76.767ms, lost 0%
[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: it-netops;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%
[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: test;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%
[Fri Aug 21 15:00:44 2015] SERVICE ALERT: igm01.londen01;_Memory Usage;OK;HARD;3;Memory buffers: 9%used(173MB/1939MB) (<80%) : OK
[Fri Aug 21 15:00:54 2015] SERVICE ALERT: igm01.londen01;_DNS Lookup;OK;HARD;3;DNS OK: 0.243 seconds response time. solarwinds.com returns 74.115.13.20
[Fri Aug 21 15:01:03 2015] SERVICE ALERT: igm01.londen01;_Swap Usage;OK;HARD;3;Swap space: 0%used(0MB/2000MB) (<80%) : OK
[Fri Aug 21 15:01:44 2015] SERVICE ALERT: igm01.londen01;eth1 Status;OK;HARD;5;OK: Interface eth1 (index 12) is up.
[Fri Aug 21 15:10:14 2015] SERVICE ALERT: igm01.londen01;_Disk Volumes Usage;OK;HARD;3;/dev/shm: 0%used(0MB/1MB) /storage: 2%used(3665MB/235817MB) /tmpfs: 5%used(0MB/5MB) /reporting: 3%used(33MB/1027MB) /config: 3%used(33MB/1027MB) /: 47%used(1777MB/3743MB) (<98%) : OK
[Fri Aug 21 15:17:14 2015] HOST ALERT: igm01.londen01;UNREACHABLE;SOFT;1;CRITICAL - 10.48.254.40: rta nan, lost 100%
[Fri Aug 21 15:17:44 2015] HOST ALERT: igm01.londen01;UP;SOFT;2;OK - 10.48.254.40: rta 76.687ms, lost 0%
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Swap Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Bandwidth;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;eth1 Bandwidth;DOWNTIMEEND (OK);xi_service_notification_handler;OK - Current BW in: 0Mbps Out: .01Mbps
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;eth1 Status;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Last Rebooted;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;_Last Rebooted;DOWNTIMEEND (OK);xi_service_notification_handler;OK - device is up since 12d 11h 58m 59s
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Memory Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_Disk Volumes Usage;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE DOWNTIME ALERT: igm01.londen01;_DNS Lookup;STOPPED; Service has exited from a period of scheduled downtime
[Fri Aug 21 19:00:00 2015] SERVICE NOTIFICATION: test;igm01.londen01;_DNS Lookup;DOWNTIMEEND (OK);xi_service_notification_handler;DNS OK: 0.088 seconds response time. solarwinds.com returns 74.115.13.20
[Fri Aug 21 23:59:59 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED;per Tom. Network troubleshooting continues
~
- jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Host in Downtime but receiving host notification
I can't recreate the problem. Can you on your system?
There is one thing I know - with flexible downtime there was (and continues to be in 4.0.8) a bug where the downtime didn't enact until after the first notification. This doesn't read like that though, and you indicated it was indeed a fixed downtime. I've tried to recreate here, I simply can't do it.
Is it safe to assume that the output we're looking at is `grep igm01.londen01 nagios.log` IN ITS ENTIRETY for that timeperiod run through a script that converts the time to human readable?
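For reference, a minimal sketch of the kind of filter being asked about: it keeps only the lines that mention one host and rewrites the leading epoch stamp of each `nagios.log` entry into the human-readable form used in the paste above. The function name and the `tz` parameter are illustrative choices, not part of any Nagios tooling, and the substring match on the host name is deliberately as loose as `grep`.

```python
import re
from datetime import datetime, timezone

# nagios.log entries start with "[<epoch seconds>] ".
EPOCH_PREFIX = re.compile(r"^\[(\d+)\] (.*)$")


def readable_host_lines(lines, host, tz=None):
    """Return the log lines mentioning `host`, with epoch stamps converted.

    tz=None uses the local time zone of the machine running the script.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if host not in line:          # same loose match as `grep host`
            continue
        m = EPOCH_PREFIX.match(line)
        if not m:                     # leave unexpected lines untouched
            out.append(line)
            continue
        stamp = datetime.fromtimestamp(int(m.group(1)), tz=tz)
        out.append(f"[{stamp.strftime('%a %b %d %H:%M:%S %Y')}] {m.group(2)}")
    return out


if __name__ == "__main__":
    sample = [
        "[1440168208] HOST DOWNTIME ALERT: igm01.londen01;STARTED; "
        "Host has entered a period of scheduled downtime\n",
        "[1440168208] HOST ALERT: some.other.host;UP;HARD;1;OK\n",
    ]
    for entry in readable_host_lines(sample, "igm01.londen01", tz=timezone.utc):
        print(entry)
```

Running it over the whole log for the time period, rather than a hand-picked excerpt, is what makes it possible to tell whether any entries are missing.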
- Box293
Re: Host in Downtime but receiving host notification
brdr wrote:There was no issue with any services, just host and no recovery sent out after downtime period ended.
You can see the lines in bold below that host recovery notification went out during period of scheduled downtime.
If I'm understanding this correctly:
- Downtime started
- Host went down during downtime
- Host came back up during downtime
- Downtime ended
With your bold lines, did you actually receive these notifications?
Re: Host in Downtime but receiving host notification
Almost...
Downtime started for all hosts/services in the host group
While in downtime a host recovered (it was down before downtime started)
Host Recovery Notifications were received while in downtime
Downtime ended
I did not expect a recovery message after downtime and did not expect a notification while in downtime.
I can try this again (put a host group (hosts/services) in downtime) this week and see if this is repeatable.
Keep you posted. Thanks.
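If the test is repeated, the sequence above can be checked mechanically against the log. The following is a rough sketch, assuming log lines in the human-readable form pasted earlier in this thread: it tracks the host's downtime depth from `HOST DOWNTIME ALERT` entries and reports any `HOST NOTIFICATION` logged while that depth is above zero. The function name is hypothetical, and counting `CANCELLED` as an exit from downtime is an assumption.

```python
import re

# "[timestamp] KIND: field;field;..." as it appears in the pasted log.
LOG_LINE = re.compile(r"^\[(?P<when>[^\]]+)\] (?P<kind>[A-Z ]+): (?P<rest>.*)$")


def host_notifications_in_downtime(lines, host):
    """Return (timestamp, contact, state) for host notifications that were
    logged while the host was inside scheduled downtime."""
    depth = 0      # a host can have several overlapping downtime entries
    flagged = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue
        kind = m.group("kind")
        fields = m.group("rest").split(";")
        if kind == "HOST DOWNTIME ALERT" and len(fields) > 1 and fields[0] == host:
            if fields[1] == "STARTED":
                depth += 1
            elif fields[1] in ("STOPPED", "CANCELLED"):
                depth = max(0, depth - 1)
        elif kind == "HOST NOTIFICATION" and len(fields) > 2 and fields[1] == host:
            # DOWNTIMESTART / DOWNTIMEEND notifications are expected, skip them.
            if depth > 0 and not fields[2].startswith("DOWNTIME"):
                flagged.append((m.group("when"), fields[0], fields[2]))
    return flagged


if __name__ == "__main__":
    sample = [
        "[Fri Aug 21 14:43:28 2015] HOST DOWNTIME ALERT: igm01.londen01;STARTED; Host has entered a period of scheduled downtime",
        "[Fri Aug 21 15:00:34 2015] HOST ALERT: igm01.londen01;UP;HARD;1;OK - 10.48.254.40: rta 76.767ms, lost 0%",
        "[Fri Aug 21 15:00:34 2015] HOST NOTIFICATION: it-netops;igm01.londen01;UP;xi_host_notification_handler;OK - 10.48.254.40: rta 76.767ms, lost 0%",
    ]
    for hit in host_notifications_in_downtime(sample, "igm01.londen01"):
        print(hit)
```

An empty result on a repeat run would suggest the original notifications were a one-off; a non-empty result gives exact lines to attach to the thread.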
Re: Host in Downtime but receiving host notification
brdr wrote:Almost...
Downtime started for all hosts/services in the host group
While in downtime a host recovered (it was down before downtime started)
Host Recovery Notifications were received while in downtime
Downtime ended
I did not expect a recovery message after downtime and did not expect a notification while in downtime.
I can try this again (put a host group (hosts/services) in downtime) this week and see if this is repeatable.
Keep you posted. Thanks.
Sounds good, just let us know.
Former Nagios Employee.
Re: Host in Downtime but receiving host notification
Please close. I think this had to do with timing: the scheduling of the host check relative to the scheduling of the downtime. If this changes I will open it back up.
Thx