Re: [Nagios-devel] fun with (silent) change from HARD to SOFT state


Michal Svoboda wrote:
> Hello,
>
> I've discovered a weird behavior, which can be replicated thus:
>
> 1. Let a service be configured for max attempts N before going to HARD
> non-ok state
>
> 2. Make the service fail and wait for N checks to complete (i.e. until the
> service enters N/N HARD non-ok state); at this point notifications
> are sent, etc.
>
> 3. Change the configuration of the service to have M > N max attempts
> and restart nagios
>
> 4. Now the state of the service is N/M _HARD_ non-ok
>
> 5. If the N+1th check results in non-ok, then the service state goes to
> N+1/M _SOFT_
>
> 6. If some future check results in ok, then the service performs a SOFT
> recovery; at the very least, no recovery notifications are sent
>
> 6a. If the condition in (5) does not occur, i.e. the N+1th check results
> immediately in ok, the service still performs a SOFT recovery from
> an apparently HARD state (even according to the logs)
>
> Now, one way to look at this behavior is that it is logical, because
> I've fiddled with the config, and I can expect anomalies and blah blah.
>
> Another way to look at it is that there have been notifications sent in
> step (2), yet there are no recovery notifications; in other words, once
> the sirens have been sounded (and the fire brigade is on the way, and
> the president is being woken up), they should also be properly shut off.
>
> So the question is whether or not to introduce a patch that prevents
> a service (or a host) from entering a SOFT state once it is already
> in a HARD non-ok state?
>
>
> With regards,
> Michal Svoboda

Nice catch. I just added some code that will readjust the current check
attempt at startup if the host/service was in a hard problem state.
That will accommodate config changes to max check attempts that are
made before a (re)start.

- Ethan Galstad
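
For readers who want to trace the report step by step, here is a minimal sketch in Python of a simplified state model. It is not Nagios code: the `Service` class, `process_check`, `readjust_on_startup` and `simulate` are hypothetical names, and the model assumes that the SOFT/HARD decision is taken from the attempt counter alone. It also assumes that "readjust" means setting the current attempt to the new max attempts value, which the reply above does not spell out.

```python
from dataclasses import dataclass

OK, CRITICAL = 0, 2          # plugin-style return codes
SOFT, HARD = "SOFT", "HARD"


@dataclass
class Service:
    max_attempts: int
    state: int = OK
    state_type: str = HARD
    current_attempt: int = 1


def process_check(svc: Service, result: int) -> bool:
    """Apply one check result. Return True if a recovery notification would go out."""
    notify_recovery = False
    if result == OK:
        if svc.state != OK:
            # Assumed model: the recovery type follows the attempt counter,
            # not the stored state type.
            hard = svc.current_attempt >= svc.max_attempts
            svc.state_type = HARD if hard else SOFT
            notify_recovery = hard
        else:
            svc.state_type = HARD
        svc.state = OK
        svc.current_attempt = 1
    else:
        if svc.state == OK:
            svc.current_attempt = 1
        elif svc.current_attempt < svc.max_attempts:
            svc.current_attempt += 1
        svc.state = result
        svc.state_type = HARD if svc.current_attempt >= svc.max_attempts else SOFT
    return notify_recovery


def readjust_on_startup(svc: Service) -> None:
    """Hypothetical fix: pin the attempt counter for a HARD problem state."""
    if svc.state != OK and svc.state_type == HARD:
        svc.current_attempt = svc.max_attempts


def simulate(n: int, m: int, apply_fix: bool, extra_failure: bool) -> bool:
    """Walk through the reported steps; return whether the final OK check notifies."""
    svc = Service(max_attempts=n)
    for _ in range(n):                 # step 2: fail until n/n HARD
        process_check(svc, CRITICAL)
    svc.max_attempts = m               # step 3: config change plus restart
    if apply_fix:
        readjust_on_startup(svc)
    if extra_failure:                  # step 5: one more non-ok result
        process_check(svc, CRITICAL)
    return process_check(svc, OK)      # step 6 / 6a: recovery
```

Under these assumptions, with N = 3 and M = 5, `simulate(3, 5, apply_fix=False, extra_failure=True)` and `simulate(3, 5, apply_fix=False, extra_failure=False)` both return `False` (a SOFT recovery, no notification), matching steps 6 and 6a. With `apply_fix=True` both return `True`, because the service stays at 5/5 HARD until it recovers.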

This post was automatically imported from historical nagios-devel mailing list archives