Re: [Nagios-devel] Event Profiler Timers?

Re: [Nagios-devel] Event Profiler Timers?

Post by Guest »

Steven D. Morrey wrote:
>>> Would it be better (i.e. worth the overhead), to use clock_t =
>>> clock() to get the entry and exit?
>> Could be. But then we might as well use gettimeofday() from the
>> start so the wrap-around case at least gets exercised a lot instead
>> of every 72 minutes.
>
> I don't follow you on that, can you elaborate on what you mean by the
> wrap-around case and it being exercised, because I am not aware of it
> and therefore have not taken it into consideration. My primary
> reasons for wanting to use clock instead of gettimeofday were speed
> and precision.
>

Righto. clock(3) returns CPU time in units of 1/CLOCKS_PER_SEC, which
POSIX fixes at 1,000,000. This means that, on 32-bit architectures, the
counter wraps around every 2^32 microseconds, i.e. every 71m 34.96s.
Consider this:

start = clock();
do_stuff();
end = clock();

if (end > start) {
/* normal case */
delta = (double)(end - start);
}
if (end < start) {
/* wrapped around; add the full 32-bit range back in */
delta = (double)(end - start) + 4294967296.0;
}

> With gettimeofday we are talking about how much calendar time has
> been spent, whereas with clock we are talking about how many ticks
> were used by the program and then dividing that by clocks_per_second
> to get a process time. I believe that clock would be more accurate
> since it filters out the cost of other processes running on the
> system.
>

True, but the meaning of it would be hard to correlate against latency,
since some events spend an inordinate amount of time switching contexts
while others are fairly cpu-intensive but achieve a lot in wallclock
time. Parsing passive check-results is CPU-intensive, but in wallclock
time we can easily parse 50 or 100 of them in the same time it takes to
fork and fire up a plugin. I'm not sure which of the values would be
preferable here, and I'm also not sure which of the processes gets the
CPU-time attributed to it.

> Another consideration would be the amount of overhead involved. It
> appears to me that
>
> elapsed_time = (double) (end - start) / CLOCKS_PER_SECOND; total_time
> += elapsed_time
>

But this is incorrect once every 72 minutes, when end < start.

> Would involve significantly less overhead than
>
> elapsed_time = (end.tv_usec - start.tv_usec) + ((end.tv_sec*1000)
> -(start.tv_sec*1000)); total_time += (double) (elapsed_time / 1000);
>

This is potentially incorrect for every check (the tv_sec terms are
scaled by 1,000 where tv_usec counts millionths of a second), but errors
in this algorithm would quite quickly be spotted, and tv_usec doesn't
really wrap; it just resets, which means we can do all the math with
ints and longs instead of off-loading it to the FPU. OTOH, Nagios
really makes no use of the FPU at all, so we could actually gain
performance by *not* using integer math. Some testing would be nice.
1 billion iterations should show some measurable delta, I think.

Note that this will be different per architecture, but I think it's
safe to assume that x86 is by

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]