Global acknowledgement timeout...

Posted: Wed Jun 20, 2012 1:33 pm
by tutungzone
All,
I'm new to the forums, but I've been an active user of Nagios for 7 years. We recently upgraded and have some new checks that we want done, and a question has come up. It may be easier to give an example of what I am looking for than to describe it:

-- A service check hits its interval, finds the service in a warning and then a critical state, and sends a notification :shock:
-- The administrator gets the notification, finds that there is nothing he can do at the moment, and acknowledges the service issue :?
-- The service does not recover for many hours, and the administrator has forgotten about it :o
-- Since the acknowledgement, the service has been in this state for 13 hours and has bombed, affecting customers :o

What would be nice (and it may already be possible without my being aware of it): if the service has not recovered within a timeout interval (or by the next standard check interval), the acknowledgement is canceled and notifications start going out again. My thought is that if you acknowledge an issue, it should be resolved within a set period of time, say 10-60 minutes. If it is not fixed in that period, the service should really be considered for downtime instead.

Can someone help me with this, or is this a function already in place where I can just modify some values? If nothing else, it would be an excellent feature for the next release, especially with a work-around for previous releases. Your help is appreciated.

Re: Global acknowledgement timeout...

Posted: Wed Jun 20, 2012 2:38 pm
by agriffin
I don't think this is currently possible. You should create an official feature request on Nagios Core's bug tracker, though. The main Nagios developers don't tend to browse this forum (they use the mailing lists instead).
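
As a possible stopgap until something like this exists in Core, an external script run from cron could approximate the timeout by sending the REMOVE_SVC_ACKNOWLEDGEMENT external command for acknowledgements that are too old. This is only a rough sketch, not a tested solution: the command file path is the usual source-install default and may differ on your system, the function names are made up for illustration, and the list of acknowledged services with their acknowledgement times is assumed to come from somewhere else (for example, parsed out of your status data).

```python
import time

# Assumption: default command file location for a source install; adjust as needed.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def expired_acks(acked, timeout_s, now):
    """Return (host, service) pairs whose acknowledgement is at least timeout_s old.

    `acked` is a hypothetical list of (host, service, ack_epoch_seconds) tuples
    that you would have to collect yourself.
    """
    return [(host, svc) for (host, svc, ack_time) in acked
            if now - ack_time >= timeout_s]


def remove_ack_command(host, service, now):
    """Build one external command line that removes a service acknowledgement."""
    return f"[{int(now)}] REMOVE_SVC_ACKNOWLEDGEMENT;{host};{service}\n"


def send_commands(lines, command_file=COMMAND_FILE):
    """Write the command lines to the Nagios external command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.writelines(lines)


def expire_old_acks(acked, timeout_s=3600, command_file=COMMAND_FILE):
    """Remove every acknowledgement older than timeout_s (default: 60 minutes)."""
    now = time.time()
    lines = [remove_ack_command(h, s, now)
             for (h, s) in expired_acks(acked, timeout_s, now)]
    if lines:
        send_commands(lines, command_file)
    return lines
```

Once the acknowledgement is removed, notifications for a service that is still in a problem state can go out again under the normal notification rules. External commands have to be enabled in the main config for any of this to work.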