MichielvM wrote: In my OP I mentioned that by default all hosts have a no-notify profile. We only react to service checks.
Is it possible that this has something to do with it?
Can you clarify what you mean by that? If the hosts are not set to notify then downtime will really not have an effect on their notifications.
notification troubles after downtime end.
Re: notification troubles after downtime end.
What I mean is that notifications for the host itself are disabled. Its associated checks (i.e. ping/disk/cpu/mem etc.) are enabled to send notifications when in a critical state.
I would assume that when a host has not recovered (i.e. has not booted correctly) and is still down after the downtime period has passed, Nagios would notice the service checks not responding and start notifying as soon as the first check interval has passed.
For some reason it did not. The history of this host shows a gap between the downtime end and the time that our tech got the system booted.
I'd like to clear up whether we're dealing with a bug here or whether I could have prevented this behavior by ...... (fill in the blanks)
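For readers following along, the setup described above might look roughly like the sketch below. The host name `app01`, the address, and the `linux-server` and `generic-service` templates plus the `check_ping` command (as shipped in the Nagios sample configuration) are assumptions for illustration only, not MichielvM's actual configuration.

```
define host {
    use                     linux-server    ; assumed sample-config template
    host_name               app01           ; hypothetical host
    alias                   app01
    address                 192.0.2.10      ; documentation-range address
    notifications_enabled   0               ; the host itself never notifies
}

define service {
    use                     generic-service ; assumed sample-config template
    host_name               app01
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    notifications_enabled   1               ; services do notify
    notification_options    c               ; only on CRITICAL
}
```

With a layout like this, any alert after downtime ends has to come from the service checks, since the host object is silenced entirely.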
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: notification troubles after downtime end.
Fred Kroeger wrote: All Notifications are set to 0 so only 1 notification is sent out. We send an email to a ticketing system so sending more than 1 notification is not an option.
Regards Fred
Fred - this implies that you're using service escalations?
I was not able to reproduce your situation on my machine with services, only with hosts. Perhaps I could if I used a service escalation, but wanted to verify that was the case before I went down that road.
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
- Contact:
Re: notification troubles after downtime end.
Sorry - I confused you and used the wrong terminology - All Host & Service Notification Intervals = 0
So a single email notification is sent out plus we have an event handler that forwards the event details to a ticketing system.
This is why if an event occurs during downtime, there are no further notifications generated if the event is still active after downtime ends.
Escalations aren't used as normally events aren't acknowledged anyway, so we would be constantly generating more notifications and tickets if we implemented escalation.
Regards.. Fred
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: notification troubles after downtime end.
That's very strange. Even with notification_interval set to 0 I still cannot reproduce the behavior you're seeing. As soon as my service comes off downtime I get a notification. This is a basic install with a basic httpd check to a remote host. I'm entering downtime, then simulating an outage during the downtime window by stopping the httpd daemon and never re-enabling it. Is this similar to your scenario?
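A test service along the lines described here could be defined roughly as follows. This is only a sketch: the host name `remote-web`, the `generic-service` template, and the `check_http` command are assumed from the Nagios sample configuration, and the check timings are arbitrary values chosen for the example.

```
define service {
    use                     generic-service ; assumed sample-config template
    host_name               remote-web      ; hypothetical remote host
    service_description     HTTP
    check_command           check_http
    max_check_attempts      3               ; arbitrary example value
    check_interval          5               ; arbitrary example value
    retry_interval          1               ; arbitrary example value
    notification_interval   0               ; notify once, never re-notify
    notification_options    w,u,c,r
}
```

The relevant setting is `notification_interval 0`, which tells Nagios to send a single problem notification rather than repeating it.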
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
- Contact:
Re: notification troubles after downtime end.
Yes - I get a notification that Downtime has ended - but not a separate Host/Service Alert notification or a trigger for the Event Handler.
Fred
Re: notification troubles after downtime end.
Can you show us the host/service config along with all other relevant configs (template, command, contact, notification handler, etc.)?
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: notification troubles after downtime end.
Fred - I have turned off d in notification options so I am not getting a downtime ended notification. I'm getting a true service critical notification after the downtime for the service has ended but while the service remains in a critical hard state.
It sounds to me like that's not what you're getting, and that's why I'm now super confused.
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
- Contact:
Re: notification troubles after downtime end.
What I have explained is what I observed in a previous version of Nagios. It was easy to observe as I could see no entries in the Notifications screen for a Host Down after downtime ended.
As I haven't seen any reference to this being fixed in the recent version Change Logs I have assumed that this is still the case.
OK - sounds like I have to retest.
Regards.... Fred
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: notification troubles after downtime end.
Fred,
After talking with Box293 it became clear to me that you and I may just be seeing similar behavior but describing it with different jargon.
What I'm able to do is have my service that failed and did not recover (stayed HARD and not OK) before the expiration of the downtime notify within the retry_interval, which is what I would *expect* the behavior to be. Box293 was suggesting that you might be seeking an immediate notification triggered by the exit of downtime AND the continued HARD not OK state, but not as a result of configuring notification_options 's'?
Is that the case? I think my confusion came from some earlier context:
MichielvM wrote: Nagios would notice the service checks not responding and start notifying as soon as the first check interval has passed.
But that is MichielvM describing his issue, which may be entirely different from yours.
I guess the only difference between the two would be that one scenario would involve submitting a bug report, the other a feature request. The scenario where an immediate notification is sent upon departure from downtime while the service is in a not OK HARD state was never part of the software, at least not to the best of my knowledge.
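To illustrate the `notification_options` distinction mentioned above, here is a hedged sketch of two service definitions. The host name, service descriptions, and template are hypothetical; the option letters are the standard service values (`w`, `u`, `c`, `r`, `f`, `s`), where `s` covers scheduled downtime start and end.

```
; Variant A: also sends a notification when scheduled downtime starts or ends
define service {
    use                     generic-service ; assumed sample-config template
    host_name               remote-web      ; hypothetical
    service_description     HTTP with downtime notices
    check_command           check_http
    notification_options    w,u,c,r,s
}

; Variant B: state notifications only, no downtime start/end notices
define service {
    use                     generic-service
    host_name               remote-web
    service_description     HTTP state only
    check_command           check_http
    notification_options    w,u,c,r
}
```

In Variant B, any message received after downtime ends is a state notification rather than a downtime notice, which makes it easier to tell the two cases discussed in this thread apart.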