[Nagios-devel] passive check expire race condition
Posted: Tue Jul 31, 2007 7:53 am
[1185891648] SERVICE ALERT: emperor20.cs.wisc.edu;what;OK;HARD;1;OK: Script ran.
[1185895333] Warning: The results of service 'what' on host 'emperor20.cs.wisc.edu' are stale by 10 seconds (threshold=3700 seconds). I'm forcing an immediate check of the service.
[1185895335] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;emperor20.cs.wisc.edu;what;0;OK: Script ran.
[1185895343] SERVICE ALERT: emperor20.cs.wisc.edu;what;CRITICAL;HARD;1;CRITICAL: Test failed. Passive check didn't send info.
It looks like, once the stale condition is noticed, it takes about 10
seconds to run the alternate active/fail check. If a passive check comes
through during that window and sets the state to OK, the result of the
fail check then overrides it.
Is there a way to make the forced check verify that a check hasn't come
through in the meantime? Or to put a semaphore on the check so that the
new passive check isn't processed until the forced check completes?
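
To make the first option concrete, here is a minimal sketch of the "verify nothing newer came through" idea. It is not Nagios code: the names CheckResult, ServiceState and apply_result are hypothetical, and it assumes each result records when its check was started. A forced check result is dropped if the service has accepted any result since that forced check began.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    source: str       # "passive" or "forced" (hypothetical labels)
    state: str        # e.g. "OK" or "CRITICAL"
    started_at: int   # epoch seconds when the check was initiated
    finished_at: int  # epoch seconds when the result became available


@dataclass
class ServiceState:
    state: str
    last_result_at: int  # epoch seconds of the last accepted result


def apply_result(svc: ServiceState, result: CheckResult) -> bool:
    """Apply a result unless it is a forced check that has been superseded.

    Returns True if the result was applied, False if it was discarded.
    """
    # A forced freshness check is only meaningful if nothing fresher
    # arrived after it was started.
    if result.source == "forced" and svc.last_result_at > result.started_at:
        return False
    svc.state = result.state
    svc.last_result_at = result.finished_at
    return True


if __name__ == "__main__":
    # Timestamps taken from the log excerpt above.
    svc = ServiceState(state="OK", last_result_at=1185891648)
    passive = CheckResult("passive", "OK", 1185895335, 1185895335)
    forced = CheckResult("forced", "CRITICAL", 1185895333, 1185895343)
    apply_result(svc, passive)
    apply_result(svc, forced)
    print(svc.state)
```

With the timestamps from the log, the passive OK at 1185895335 is newer than the start of the forced check at 1185895333, so the CRITICAL result would be discarded instead of overriding it. The semaphore option would instead delay the passive result until the forced check completes, which keeps ordering simple but holds up fresh data for the length of the forced check.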
--
Michelle
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]