Re: [Nagios-devel] passive check expire race condition


Michelle Craft wrote:
> [1185891648] SERVICE ALERT: emperor20.cs.wisc.edu;what;OK;HARD;1;OK: Script ran.
> [1185895333] Warning: The results of service 'what' on host 'emperor20.cs.wisc.edu' are stale by 10 seconds (threshold=3700 seconds). I'm forcing an immediate check of the service.
> [1185895335] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;emperor20.cs.wisc.edu;what;0;OK: Script ran.
> [1185895343] SERVICE ALERT: emperor20.cs.wisc.edu;what;CRITICAL;HARD;1;CRITICAL: Test failed. Passive check didn't send info.
>
> It looks like, once the stale condition is noticed, it takes about 10
> seconds to run the alternate active/fail check. If a passive check comes
> through in that time and sets the state to OK, the fail check overrides it.
>
> Is there a way to make the forced check verify that a check hasn't come
> through in the meantime? Or to put a semaphore on the check so that the
> new passive check isn't processed until the forced check completes?
>
> --
> Michelle
>

This has been on my todo list for a while, and it's finally done. :-) A
fix was just posted to the HEAD branch of CVS (Nagios 3) that will cause
freshness check results to be ignored if a passive check arrived between
1) the time the service was detected as stale and a check was initiated
and 2) the time the freshness check results are processed.
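
The rule described above can be illustrated with a small sketch. This is not the actual Nagios patch (Nagios is written in C); the `Service` class, its fields, and the three function names are hypothetical and exist only to show the comparison between the time the freshness check was initiated and the time of the latest passive result. The timestamps are taken from the log lines quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Service:
    # Hypothetical model of a service; not Nagios's real data structure.
    name: str
    state: str = "OK"
    last_passive_result: int = 0                    # epoch time of latest passive result
    freshness_check_started: Optional[int] = None   # set when staleness forces a check


def force_freshness_check(svc: Service, now: int) -> None:
    """Record the moment the service was found stale and a check was initiated."""
    svc.freshness_check_started = now


def process_passive_result(svc: Service, state: str, now: int) -> None:
    """Apply a passive check result and remember when it arrived."""
    svc.state = state
    svc.last_passive_result = now


def process_freshness_result(svc: Service, state: str) -> bool:
    """Apply the forced check's result unless a passive result arrived
    after the freshness check was initiated. Returns True if applied."""
    started = svc.freshness_check_started
    svc.freshness_check_started = None
    if started is not None and svc.last_passive_result >= started:
        return False  # a passive result came in meanwhile; ignore the stale verdict
    svc.state = state
    return True


# Replay of the sequence from the quoted log.
svc = Service("what", last_passive_result=1185891648)
force_freshness_check(svc, 1185895333)           # staleness detected
process_passive_result(svc, "OK", 1185895335)    # passive result arrives 2 s later
applied = process_freshness_result(svc, "CRITICAL")  # forced check finishes
```

With this ordering the forced check's CRITICAL result is discarded and the service stays OK, which is the behavior the fix is meant to provide.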


Ethan Galstad
Nagios Developer
___
Email: [email protected]
Web: www.nagios.org





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]