Notification delay issues

Posted: Tue May 28, 2013 1:11 pm
by c.slagel
So here's my situation:

A service enters a hard state after 3 checks, which then triggers an event handler to restart the service on its host.

I have an initial notification delay in place so that I don't get alerted unless the restart fails to bring the service back to a stable state.

However, very often (though not always) Nagios ignores the delay and sends the notification as soon as the service enters a hard state. I then get the recovery notification shortly afterwards.

Any input on how to fix this? It's pretty annoying to get so many false positives.

Thanks!

Re: Notification delay issues

Posted: Tue May 28, 2013 1:14 pm
by slansing
How far out is your notification delayed? It is possible the event handler is running into latency issues at times.

Re: Notification delay issues

Posted: Tue May 28, 2013 1:19 pm
by scottwilkerson
One other thing to note is that the notification delay is counted from the time of the last known OK state, not from the time the object first went down.

People often make the mistake of setting the notification delay too low. It must be longer than

check_interval + (max_check_attempts * retry_interval)


to have any effect at all...
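As a rough illustration of that rule of thumb, here is a minimal Python sketch that computes the threshold and checks a configured delay against it. The function names are made up for this example, and the interval values in the usage note are hypothetical, not taken from anyone's actual config. It also assumes the delay and the intervals are all expressed in the same time unit (minutes, with a default interval_length).

```python
def min_effective_delay(check_interval, max_check_attempts, retry_interval):
    """Threshold from the formula above, in the same time unit as the inputs.

    A notification delay at or below this value is expected to have no
    effect, because the delay is counted from the last known OK state.
    """
    return check_interval + (max_check_attempts * retry_interval)


def delay_is_effective(notification_delay, check_interval,
                       max_check_attempts, retry_interval):
    """Return True if the configured delay is longer than the threshold."""
    threshold = min_effective_delay(check_interval, max_check_attempts,
                                    retry_interval)
    return notification_delay > threshold
```

For example, with hypothetical values of check_interval = 5, max_check_attempts = 3 and retry_interval = 4, the threshold works out to 5 + (3 * 4) = 17, so a delay of 22 would pass this check while a delay of 17 or less would not. Passing the check only means the delay is long enough on paper; it says nothing about check latency or how long the event handler takes to run.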

Re: Notification delay issues

Posted: Tue May 28, 2013 1:52 pm
by c.slagel
Latency might be an issue, but as far as timing goes we should only need a total of 17 minutes, and the delay is set to 22, so that shouldn't be the problem... Next time it triggers I'll take screenshots of the specific settings and share them to give a better picture.

Re: Notification delay issues

Posted: Tue May 28, 2013 2:23 pm
by abrist
Great, we await your findings.