Checks before notification for OK status

Post by **eloyd** » Sat Jun 13, 2015 6:54 am

I can think easily in terms of event handlers, so it would take me about two minutes to write up an event handler that keeps track of how many OKs have occurred in a row using a flat file counter. I'd then send a command back to Nagios to do the notification and disable recovery notifications on the service, and would have accomplished what the OP asked for in a few minutes using event handlers.

However, the OP may not think as easily in event handlers as I do, and is looking for a recovery version of max_check_attempts to delay recovery notification by two, three, or more iterations without engaging flap detection (meaning, without it having to go non-OK at some point first).

I see this as being something that has potential for use within the Nagios framework, even though it is creatable with an event handler, so I second the nomination for a feature request. Just because event handlers can be used to do almost every form of notification that Nagios does now, doesn't mean it's the easiest path forward, so I can foresee some circumstances when this might be preferred over writing an event handler.

In the meantime, OP, with some minimal shell scripting, you can do exactly what you want. Here's the idea, in English, not code:


#!/bin/sh
#
process command line arguments from Nagios

extract contents of /tmp/$servicename.tmp file, which contains the number of consecutive times this service has been OK
check for file existence and so forth, first
if the current state is not OK, then write "0" to the file and exit

since we are OK, increment the count and write it back out to the file
decide whether or not to notify
if not, exit

since we are going to notify, send a custom notification using Nagios (http://old.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=135)

Obviously, this is not efficient, but you get the idea.
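
The outline above could be fleshed out roughly as follows. This is a minimal sketch, not a tested production handler: the function name `handle_check_result`, the counter directory, the command-file path `/usr/local/nagios/var/rw/nagios.cmd`, the threshold of 3, and the author and comment strings are all assumptions for illustration. It also assumes the script is called with the host name, service description, and service state for every check result; how that is wired up is not shown here.

```shell
#!/bin/sh
# Sketch only: delay the recovery notification until a service has been OK
# for THRESHOLD consecutive checks. Paths and threshold are assumptions.

COUNT_DIR=${COUNT_DIR:-/tmp}                               # where counter files live
CMD_FILE=${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}  # assumed command file path
THRESHOLD=${THRESHOLD:-3}                                  # OKs required before notifying

# handle_check_result HOST SERVICE STATE (hypothetical helper name)
handle_check_result() {
    host=$1
    service=$2
    state=$3

    # Build a filesystem-safe counter file name from host and service.
    key=$(printf '%s' "${host}_${service}" | tr -c 'A-Za-z0-9_.-' '_')
    counter="$COUNT_DIR/$key.tmp"

    # Any non-OK state resets the streak.
    if [ "$state" != "OK" ]; then
        echo 0 > "$counter"
        return 0
    fi

    # Read the previous count; fall back to 0 if missing or not a number.
    count=0
    if [ -r "$counter" ]; then
        read -r count < "$counter" || count=0
    fi
    case $count in
        ''|*[!0-9]*) count=0 ;;
    esac

    # We are OK, so extend the streak and store it.
    count=$((count + 1))
    echo "$count" > "$counter"

    # Notify exactly once, when the streak first reaches the threshold.
    [ "$count" -eq "$THRESHOLD" ] || return 0

    # Options value 2 requests a forced notification; author and comment
    # are placeholders.
    printf '[%s] SEND_CUSTOM_SVC_NOTIFICATION;%s;%s;2;eventhandler;OK for %s consecutive checks\n' \
        "$(date +%s)" "$host" "$service" "$count" >> "$CMD_FILE"
}

# Run only when called with arguments, e.g.
#   "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATE$"
if [ "$#" -ge 3 ]; then
    handle_check_result "$@"
fi
```

Because it notifies only when the count equals the threshold, a long run of OKs produces a single notification. A missing counter file is treated as a count of zero, so the very first run on a healthy service will also notify once after the threshold is reached.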

Post by **abrist** » Mon Jun 15, 2015 10:47 am

eloyd wrote: Obviously, this is not efficient, but you get the idea.

This is actually my biggest concern with using event handlers for this, as it adds quite a number of ops for every check on every host. If flapping thresholds do not meet their needs, this is a good shim in the meantime, though a feature request is probably a good idea:
http://tracker.nagios.org

Post by **elkali** » Mon Jun 15, 2015 11:15 am

eloyd wrote: I can think easily in terms of event handlers,

[...]

I could not possibly have explained it better. Thanks for the understanding and the support.

eloyd wrote: the OP [...] is looking for a recovery version of max_check_attempts to delay recovery notification by two, three, or more iterations

Exactly what I'd like to get.

eloyd wrote: so I second the nomination for a feature request

Thanks a lot!

Post by **eloyd** » Mon Jun 15, 2015 11:20 am

You should come to the 2015 Nagios World Conference in September and you can talk directly with the developers to help them understand what you want to accomplish! If you register (https://conference.nagios.com/register) tell them eloyd sent you!

Post by **elkali** » Mon Jun 15, 2015 11:31 am

eloyd wrote: You should come to the 2015 Nagios World Conference in September and you can talk directly with the developers to help them understand what you want to accomplish! If you register (https://conference.nagios.com/register) tell them eloyd sent you!

That'd be awesome, but unfortunately I live in Berlin, and since my company won't cover the registration fee, it's not only far away geographically but also far from my budget.

Thanks anyway!

Post by **eloyd** » Mon Jun 15, 2015 11:33 am

I understand. You should tell your employer how wonderful it would be though, and try to get them to send you! I'll be there, along with a lot of other people you'll see here in the forums.

Post by **abrist** » Mon Jun 15, 2015 12:52 pm

Also, the presentations are fantastic!

Post by **eloyd** » Mon Jun 15, 2015 1:03 pm

@abrist is just saying that because he knows I am going to be speaking.

Post by **tmcdonald** » Mon Jun 15, 2015 1:07 pm

Back to the issue at hand, if you would like to see functionality added to Nagios Core, please post an issue here: https://github.com/NagiosEnterprises/nagioscore

That way the devs can have it on their radar. No guarantees it will be implemented, but the first step is getting it in their sights.

Post by **elkali** » Mon Jun 15, 2015 1:10 pm

tmcdonald wrote: Back to the issue at hand, if you would like to see functionality added to Nagios Core, please post an issue here: https://github.com/NagiosEnterprises/nagioscore

That way the devs can have it on their radar. No guarantees it will be implemented, but the first step is getting it in their sights.

https://github.com/NagiosEnterprises/na ... /issues/46

It's there already.

Thanks!

Nagios Support Forum
