However, the OP may not think as easily in event handlers as I do, and is looking for a recovery version of max_check_attempts to delay recovery notification by two, three, or more iterations without engaging flap detection (meaning, without it having to go non-OK at some point first).
I think this has real potential within the Nagios framework, even though it can be built with an event handler, so I second the nomination for a feature request. Event handlers can reproduce almost every form of notification Nagios does now, but that doesn't make them the easiest path forward, and I can foresee circumstances where a built-in option would be preferred over writing one.
In the meantime, OP, with some minimal shell scripting, you can do exactly what you want. Here's the idea, in English, not code:
#!/bin/sh
#
# process command line arguments from Nagios
# extract contents of /tmp/$servicename.tmp file, which contains the number of consecutive times this service has been OK
#   (check for file existence and so forth, first)
# if the current state is not OK, then write "0" to the file and exit
# since we are OK, increment the count and write it back out to the file
# decide whether or not to notify
# if not, exit
# since we are going to notify, send a custom notification using Nagios (http://old.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=135)
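If it helps, here is one rough way the outline above could look as actual shell. Treat it as a sketch, not a tested drop-in: the script name, the argument order (host name, service description, service state), the THRESHOLD value, the author string, and the command file path are all my assumptions, so adjust them for your install. I'm also assuming the linked external command is SEND_CUSTOM_SVC_NOTIFICATION and that Nagios calls the script for every check result with those three arguments. The counter file is keyed on host plus service rather than service name alone, so the same service on two hosts doesn't share a count.

```shell
#!/bin/sh
# Sketch: delay a recovery notification until a service has been OK for
# THRESHOLD consecutive checks.
# Hypothetical invocation from a Nagios command definition:
#   delay_recovery.sh "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATE$"

# Assumptions, overridable from the environment:
THRESHOLD="${THRESHOLD:-3}"      # consecutive OK results required before notifying
STATE_DIR="${STATE_DIR:-/tmp}"   # where the per-service counter files live
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"  # external command file

delay_recovery() {
    host="$1"; service="$2"; state="$3"

    # Build a counter file name, replacing characters that are unsafe in paths.
    safe=$(printf '%s_%s' "$host" "$service" | tr -c 'A-Za-z0-9_.-' '_')
    file="$STATE_DIR/$safe.tmp"

    # Read the previous count; fall back to 0 if the file is missing or garbage.
    count=0
    if [ -r "$file" ]; then
        count=$(cat "$file")
    fi
    case "$count" in
        ''|*[!0-9]*) count=0 ;;
    esac

    # Not OK: reset the streak and stop.
    if [ "$state" != "OK" ]; then
        echo 0 > "$file"
        return 0
    fi

    # OK: extend the streak and save it.
    count=$((count + 1))
    echo "$count" > "$file"

    # Notify only when the streak first reaches the threshold.
    if [ "$count" -ne "$THRESHOLD" ]; then
        return 0
    fi

    # Option 2 = forced notification; "eventhandler" is a placeholder author.
    printf '[%s] SEND_CUSTOM_SVC_NOTIFICATION;%s;%s;2;eventhandler;OK for %s consecutive checks\n' \
        "$(date +%s)" "$host" "$service" "$count" >> "$CMD_FILE"
}

# Run only when called with the three expected arguments.
if [ "$#" -ge 3 ]; then
    delay_recovery "$@"
fi
```

Because it notifies only when the count equals the threshold exactly, you get one custom notification per recovery rather than one on every OK check after that.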