[Nagios-devel] Nagios Profiler Changes

Post by Guest »

Hi Everyone,

As you know, I've been hard at work creating a profiler for Nagios that is simple, flexible, extensible, fast and above all accurate.

My initial design was a collection of global timer variables (gt_*) and global odometer variables (go_*) that could then be written out to status.dat one by one.
This worked OK but quickly became unwieldy, since every new statistic meant another global variable and another line of output code.

My next design was a linked list of objects, each containing the timer and the counter as well as a name or event type. This made extensibility a snap, but it would have had a significant impact on speed: every time we wanted to update a stat we would have to walk the list at best, and at worst do a strcmp on every single object. So this idea was discarded for the time being.

Finally I had a better idea. Each event type is an integer, and even though the values aren't necessarily close together, they are still suitable as an array index, even if the array is a sparse one.
So this is the new profiler design.

We have an object containing three fields:
elapsed time, counter, enabled

We have an array of these objects indexed by event type, so updating a stat is a single array access:
profiler[event].counter++;
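
To make the array design concrete, here is a minimal sketch in C. The type name profile_entry, the bound PROFILER_MAX_EVENT and the helper profiler_update are assumptions made for illustration; they are not taken from the actual patch.

```c
#include <assert.h>

/* Hypothetical upper bound on event type values; the real limit would
 * come from the Nagios event definitions. */
#define PROFILER_MAX_EVENT 128

/* One slot per event type: accumulated time, hit count, enabled flag. */
typedef struct profile_entry {
    double elapsed;        /* total seconds spent in this event type */
    unsigned long counter; /* number of times the event was seen */
    int enabled;           /* non-zero if this slot should be reported */
} profile_entry;

/* Sparse array: unused event numbers simply stay zeroed and disabled. */
static profile_entry profiler[PROFILER_MAX_EVENT];

/* Record one occurrence of an event in constant time.
 * Assumption: a slot is marked enabled the first time it is updated. */
static void profiler_update(int event, double seconds)
{
    assert(event >= 0 && event < PROFILER_MAX_EVENT);
    profiler[event].enabled = 1;
    profiler[event].counter++;
    profiler[event].elapsed += seconds;
}
```

Compared with the linked list, an update here needs no list walk and no strcmp; the cost is a few unused slots for event numbers that never occur.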

Then, when we write it out to status.dat, a very simple loop checks whether each event type is enabled for profiling and outputs it if it is.
The output looks like
PROFILE_COUNTER_EVENT_SERVICE_CHECK=100
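
The output loop could look roughly like the sketch below. The name field, the function profiler_write and the exact format of the ELAPSED line are assumptions; only the PROFILE_COUNTER_ line format is shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROFILER_MAX_EVENT 16   /* hypothetical bound for this sketch */

typedef struct profile_entry {
    const char *name;      /* hypothetical: label used in the output key */
    double elapsed;        /* accumulated seconds */
    unsigned long counter; /* number of occurrences */
    int enabled;           /* non-zero if this slot should be written */
} profile_entry;

/* Write every enabled, named slot as PROFILE_COUNTER_<name>=<n> followed by
 * PROFILE_ELAPSED_<name>=<seconds>. Returns the number of slots written. */
static int profiler_write(FILE *fp, const profile_entry *table, int n)
{
    int written = 0;

    assert(fp != NULL && table != NULL);
    for (int i = 0; i < n; i++) {
        /* Skip disabled slots and the unused holes in the sparse array. */
        if (!table[i].enabled || table[i].name == NULL
            || strlen(table[i].name) == 0)
            continue;
        fprintf(fp, "PROFILE_COUNTER_%s=%lu\n", table[i].name, table[i].counter);
        fprintf(fp, "PROFILE_ELAPSED_%s=%.6f\n", table[i].name, table[i].elapsed);
        written++;
    }
    return written;
}
```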

Nagiostats then looks for the word PROFILE, and then for COUNTER or ELAPSED, adds the entry to a linked list a la my second design, and outputs it via MRTG or the normal nagiostats output.

The other major difference is what we use to measure time.
In the original design we just used time(), but later we decided we needed more resolution, so we went to clock(). Finally it was discovered that using clock() would introduce a bug every 72 minutes (roughly how long a 32-bit clock_t takes to wrap when CLOCKS_PER_SEC is 1,000,000), so now we just use gettimeofday().
In the next version I may include clock() time as well, but I thought that this would be sufficient for our needs.
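
For reference, measuring an interval with gettimeofday() can be sketched as follows. The helper names tv_elapsed and time_call are assumptions made for illustration, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Seconds between two gettimeofday() samples, as a double.
 * The microsecond difference may be negative; the sum is still correct. */
static double tv_elapsed(const struct timeval *start, const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_usec - start->tv_usec) / 1e6;
}

/* Time one call to work() and return the elapsed wall-clock seconds. */
static double time_call(void (*work)(void))
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    work();
    gettimeofday(&end, NULL);
    return tv_elapsed(&start, &end);
}
```

Note that gettimeofday() reports wall-clock time, so a measurement can be skewed if the system clock is adjusted while the event is running, whereas clock() reports CPU time used by the process.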
Let me know what you think and I'll try to get a patch out ASAP.

Sincerely,
Steve

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]