[Nagios-devel] Run away service check latency in 2.12

Post by Guest »

OK folks, so I've finally found how latency is calculated in Nagios 2.12.

In events.c, at around line 1002, we have these lines:

gettimeofday(&tv,NULL);
temp_service->latency=(double)((double)(tv.tv_sec - event_list_low->run_time)+(double)(tv.tv_usec/1000)/1000.0);

As you can see, latency is literally the difference between now and when the check should have run, i.e. event_list_low->run_time.
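For anyone who wants to poke at the arithmetic outside of Nagios, here is a minimal standalone sketch of the same calculation. The function name compute_latency is my own and does not exist in Nagios; it only mirrors the expression quoted above.

```c
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper (not a Nagios function): seconds between a scheduled
 * run time and "now", following the expression quoted from events.c. */
double compute_latency(struct timeval now, time_t scheduled_run_time)
{
    /* Whole seconds elapsed since the check was supposed to run. */
    double whole = (double)(now.tv_sec - scheduled_run_time);

    /* Sub-second part of "now": integer division truncates the
     * microseconds to milliseconds before converting to seconds. */
    double frac = (double)(now.tv_usec / 1000) / 1000.0;

    return whole + frac;
}
```

With now = 1000 s + 250000 us and a scheduled run time of 990, this gives 10.25. Note that the only input besides the clock is the scheduled run time, so a run time that never moves makes the result grow with the clock.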
This all seems great until you look a little further down and see that there are at least 5 conditions that would prevent the check from being run, so even though its latency is updated, its run time does not get updated.
This means that as time goes on and those checks get older, latency will continue to increase ad infinitum.
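The runaway effect can be reproduced with a toy model. This is only a sketch under my own assumptions: struct sim_check, sim_pass and the blocked flag are made up for illustration and stand in for the real event and service structures and the conditions that stop a check from running.

```c
#include <time.h>

/* Hypothetical, simplified model of a scheduled check (not Nagios' structs). */
struct sim_check {
    time_t run_time;  /* when the check was supposed to run */
    double latency;   /* last recorded latency, in seconds */
    int    blocked;   /* nonzero if some condition prevents execution */
};

/* One scheduler pass at time `now`, mirroring the behaviour described above:
 * latency is always recorded, but run_time only advances if the check runs. */
void sim_pass(struct sim_check *c, time_t now, time_t interval)
{
    c->latency = (double)(now - c->run_time);
    if (!c->blocked) {
        c->run_time = now + interval;
    }
}
```

Calling sim_pass on a blocked check at now = 160 and then 220, with run_time stuck at 100, records latencies of 60 and then 120, and it keeps climbing on every pass.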
This isn't an issue unless average service check latency is an important stat for you, which it is around here.
Basically, if we have 100 services all OK with 0 latency and then one service that doesn't play nice and ends up with 10000 latency, we now have an average latency of roughly 99 (10000/101).
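To sanity-check the arithmetic: 100 services at 0 plus one at 10000 makes 101 services, so the mean is 10000/101, or about 99. A plain averaging helper shows it; average_latency is a hypothetical name of mine, not something from the Nagios source.

```c
#include <stddef.h>

/* Hypothetical helper: arithmetic mean of per-service latencies.
 * Returns 0.0 for an empty set to avoid dividing by zero. */
double average_latency(const double *latencies, size_t count)
{
    if (count == 0) {
        return 0.0;
    }

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        sum += latencies[i];
    }
    return sum / (double)count;
}
```

The point stands regardless of the exact figure: a single stuck check drags the average from 0 to a number that says nothing about the other 100 services, and it gets worse as the stuck check ages.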

The best solution, of course, is to simply remove the offending service check.
However, if like me you're in a situation where that cannot be done, I have come up with 2 other possibilities.

Either move the latency update down to where the check ACTUALLY executes, or have it always reschedule checks even if they have failed by moving the rescheduling code out of the if(run_event==TRUE) block.
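Here is a rough sketch of how the two options differ, using a toy model rather than the real events.c code. Everything here (struct check_model, the blocked flag, both function names) is hypothetical and only meant to show the behaviour, not to serve as a patch.

```c
#include <time.h>

/* Hypothetical, simplified model of a scheduled check (not Nagios' structs). */
struct check_model {
    time_t run_time;  /* when the check was supposed to run */
    double latency;   /* last recorded latency, in seconds */
    int    blocked;   /* nonzero if some condition prevents execution */
};

/* Option 1: only record latency when the check actually executes. */
void pass_latency_on_execute(struct check_model *c, time_t now, time_t interval)
{
    if (c->blocked) {
        return;  /* skipped: latency and run_time are left untouched */
    }
    c->latency = (double)(now - c->run_time);
    c->run_time = now + interval;
}

/* Option 2: always reschedule, even when the check is skipped. */
void pass_always_reschedule(struct check_model *c, time_t now, time_t interval)
{
    c->latency = (double)(now - c->run_time);
    c->run_time = now + interval;  /* no longer conditional on execution */
}
```

In this model, option 1 leaves a blocked check at whatever latency it last had, while option 2 records one late reading and then settles back to 0 once the run time starts moving again. Whether either is safe in the real scheduler depends on what else relies on run_time, which is what I'd like feedback on.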

I would like to get some feedback on this, since it has seriously been throwing off my stats.

Thanks in advance!

Sincerely,
Steven Morrey

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]