Re: [Nagios-devel] Max concurrent checks - spreading the next_time
On 11 Jun 2009, at 08:55, Hiren Patel wrote:
> I haven't seen this being an issue with our setup, but if there's a way
> to simulate it and you'd like me to test this for you, I'd be glad to.
Thanks for the offer.
This is the test case:
* set max_concurrent_checks=1 in nagios.cfg
* create a host with 3 services with a check_interval of 1 minute
* restart nagios
* go to the host page and schedule a check for all services on the
  host (this makes all the services run at the same time)
* tail nagios.log. You should see "Max concurrent service checks (1)
  has been reached"
* on the host page, note the last check time of each service. Only one
  will be updated after 1 minute. All services get scheduled for the
  same next check time, and after the next minute, only one of them
  will have its last check time changed
I've just committed a patch into CVS HEAD. This nudges the time ahead
by 5 + random(10) seconds. I've also included a test case which
ensures that the nudge factor is added in these cases.
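
A minimal sketch of the "5 + random(10)" nudge, to make the arithmetic concrete. This is not the committed patch: the function name and parameters are hypothetical, and it assumes random(10) means a value in the range 0..9.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper, not taken from the Nagios source.
 * Returns a time that is base + [0, spread) seconds after `now`.
 * With base = 5 and spread = 10 the result is 5..14 seconds ahead. */
time_t nudged_check_time(time_t now, int base, int spread)
{
    assert(spread > 0);           /* avoid a modulo by zero */
    int jitter = rand() % spread; /* 0 .. spread-1 */
    return now + base + jitter;
}
```

Calling `nudged_check_time(now, 5, 10)` for each service that hit the limit gives the services different next times instead of all landing on the same second again.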
nagios.log will also have an entry which lists the affected service.
If you get this message a lot on a regular system, then you need to
consider increasing the max_concurrent_checks value.
I'd be grateful if you could try this out.
Thinking some more, setting the next check time ahead doesn't really
make sense, because the latency value then does not reflect the fact
that this active service's check was delayed. Maybe this should instead
be implemented by removing the event from the queue and re-adding it
with a nudged event run time but the old service->next_check time.
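
A rough sketch of that alternative, assuming latency is measured as the actual start time minus the intended next_check. The struct and function names here are hypothetical stand-ins, not the real Nagios types or event queue API.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for a service and its queued event. */
struct service_sketch {
    time_t next_check;            /* when the check was meant to run */
};

struct event_sketch {
    time_t run_time;              /* when the scheduler will fire the event */
    struct service_sketch *svc;
};

/* Delay only the event; the service keeps its original next_check. */
void nudge_event_only(struct event_sketch *ev, time_t now, int nudge_seconds)
{
    assert(ev != NULL && ev->svc != NULL);
    ev->run_time = now + nudge_seconds;
}

/* Latency relative to the intended time, so the delay stays visible. */
double check_latency(const struct service_sketch *svc, time_t started_at)
{
    return difftime(started_at, svc->next_check);
}
```

With this shape, a check that was meant to run at time T but starts 7 seconds later reports 7 seconds of latency, rather than 0 as it would if next_check itself had been moved.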
Anyhow, this should be better than it was.
Ton
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]