Re: [Nagios-devel] Nagios Profiler Changes

Steven D. Morrey wrote:
> Hi Everyone,
>
> As you know, I've been hard at work creating a profiler for Nagios that is simple, flexible, extensible, fast and, above all, accurate.
>
> My initial design was to create a collection of global timer (gt_*) and global odometer (go_*) variables that could then be written out to status.dat one by one.
> This worked OK but quickly became unwieldy, for obvious reasons.
>
> My next design was a linked list of objects, each containing the timer and the counter as well as a name or event type. This made extensibility a snap, but it would have had a significant impact on speed: every time we wanted to update a stat we would have to walk the list at best and, at worst, do a strcmp on every single object. So this idea was discarded for the time being.
>
> Finally I had a better idea. Each event type is an integer, and even though the values aren't necessarily close together, they are still suitable as array indices, even if the array is a sparse one.
> So this is the new profiler design.
>
> We have an object containing
> elapsed time, counter, enabled
>
> We have an array of these objects indexed by event type
> profiler[event].counter++;
>
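
A minimal sketch of the array-of-slots idea described above, assuming a fixed upper bound on event-type values. The names profile_slot, profiler_update and MAX_EVENT_TYPE, and the numeric event values, are hypothetical and only illustrate the indexing; the actual patch may be laid out differently.

```c
#include <assert.h>

/* Hypothetical event-type values; the real Nagios constants may differ. */
enum {
    EVENT_SERVICE_CHECK = 0,
    EVENT_HOST_CHECK    = 12,
    EVENT_USER_FUNCTION = 99,
    MAX_EVENT_TYPE      = 100   /* one past the largest index (assumed) */
};

/* One slot per event type: accumulated time, hit count, on/off flag. */
typedef struct profile_slot {
    double        elapsed;   /* accumulated seconds */
    unsigned long counter;   /* number of occurrences */
    int           enabled;   /* non-zero if this event type is profiled */
} profile_slot;

/* Sparse array indexed by event type; zero-initialised at file scope. */
profile_slot profiler[MAX_EVENT_TYPE];

/* Record one occurrence of `event` that took `seconds`.
 * Constant time: no list walk and no strcmp. */
void profiler_update(int event, double seconds)
{
    assert(event >= 0 && event < MAX_EVENT_TYPE);
    if (!profiler[event].enabled)
        return;
    profiler[event].counter++;
    profiler[event].elapsed += seconds;
}
```

The unused slots cost a few bytes each, which is the price paid for avoiding the lookup that the linked-list design needed.
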
> Then when we write it out to status.dat we have a very simple loop that looks to see if the event type is enabled for profiling and outputs it if it is.
> The output looks like
> PROFILE_COUNTER_EVENT_SERVICE_CHECK=100
>
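
The write-out loop could look roughly like the sketch below. It assumes each slot also carries a printable name, and that the elapsed time is written with a PROFILE_ELAPSED_ prefix in the same style as the counter line; only the counter line format is shown in the message above, so the elapsed format, the name field and profiler_dump itself are assumptions.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_EVENT_TYPE      100   /* hypothetical upper bound */
#define EVENT_SERVICE_CHECK 0     /* hypothetical value */

typedef struct profile_slot {
    const char   *name;      /* e.g. "EVENT_SERVICE_CHECK"; NULL if unused */
    double        elapsed;   /* accumulated seconds */
    unsigned long counter;   /* number of occurrences */
    int           enabled;   /* non-zero if this event type is profiled */
} profile_slot;

/* Only the slots that exist get a name; the rest stay zeroed. */
profile_slot profiler[MAX_EVENT_TYPE] = {
    [EVENT_SERVICE_CHECK] = { "EVENT_SERVICE_CHECK", 0.0, 0, 0 },
};

/* Write every enabled slot to fp; returns the number of lines written. */
int profiler_dump(FILE *fp)
{
    int lines = 0;

    assert(fp != NULL);
    for (int ev = 0; ev < MAX_EVENT_TYPE; ev++) {
        const profile_slot *s = &profiler[ev];

        if (!s->enabled || s->name == NULL)
            continue;   /* skip disabled and unused slots */
        fprintf(fp, "PROFILE_COUNTER_%s=%lu\n", s->name, s->counter);
        fprintf(fp, "PROFILE_ELAPSED_%s=%.6f\n", s->name, s->elapsed);
        lines += 2;
    }
    return lines;
}
```
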
> nagiostats then looks for the word PROFILE, then for COUNTER or ELAPSED, adds the entry to a linked list à la my second design, and outputs it via MRTG or the normal nagiostats output.
>
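
On the reading side, the matching described above might be sketched as follows. This is not the nagiostats code: profile_stat, stat_kind and profile_stat_parse are hypothetical names, and the sketch assumes one key=value pair per line with optional leading whitespace.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { STAT_COUNTER, STAT_ELAPSED } stat_kind;

/* Linked-list node holding one parsed profiler statistic. */
typedef struct profile_stat {
    stat_kind            kind;
    char                 name[64];   /* e.g. "EVENT_SERVICE_CHECK" */
    double               value;
    struct profile_stat *next;
} profile_stat;

/* Parse one status.dat line. On a match, prepend a node to *head and
 * return 1; return 0 for unrelated or malformed lines. */
int profile_stat_parse(const char *line, profile_stat **head)
{
    stat_kind kind;
    const char *p, *eq;
    size_t name_len;
    profile_stat *node;

    assert(line != NULL && head != NULL);
    while (*line == ' ' || *line == '\t')
        line++;                          /* tolerate indentation */
    if (strncmp(line, "PROFILE_", 8) != 0)
        return 0;
    p = line + 8;

    if (strncmp(p, "COUNTER_", 8) == 0)
        kind = STAT_COUNTER;
    else if (strncmp(p, "ELAPSED_", 8) == 0)
        kind = STAT_ELAPSED;
    else
        return 0;
    p += 8;

    eq = strchr(p, '=');
    if (eq == NULL || eq == p)
        return 0;                        /* no '=' or empty name */
    name_len = (size_t)(eq - p);
    if (name_len >= sizeof node->name)
        return 0;                        /* name too long for the node */

    node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->kind = kind;
    memcpy(node->name, p, name_len);
    node->name[name_len] = '\0';
    node->value = strtod(eq + 1, NULL);
    node->next = *head;
    *head = node;
    return 1;
}
```

The list only has to be walked once per report here, so the cost that ruled it out inside the daemon does not matter much on the nagiostats side.
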
> The other major difference is what we are using to calculate time.
> In the original design we just used time(), but later we decided we needed more resolution, so we went to clock(). Finally, it was discovered that using clock() would introduce a bug every 72 minutes, so now we just use gettimeofday().
> In the next version I may include clock() time as well, but I thought that this would be sufficient for our needs.
> Let me know what you think and I'll try to get a patch out ASAP.
>
I would definitely like this feature; I think it could help me when
diagnosing issues.
So we're going to use tv_usec with gettimeofday()? Using only tv_sec
would otherwise be the same as time(), wouldn't it?
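
For what it's worth, an elapsed-time calculation with gettimeofday() would presumably combine both fields, since tv_sec on its own only has the one-second resolution of time(). The sketch below is an assumption about how that could be done, not the patch itself; timeval_diff and time_call are hypothetical helpers. As background on the 72-minute figure: with a 32-bit clock_t and CLOCKS_PER_SEC equal to 1000000, clock() wraps after 2^32 microseconds, which is about 4295 seconds or roughly 71.6 minutes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Difference end - start in seconds, using both timeval fields.
 * The tv_usec difference may be negative; the sum is still correct. */
double timeval_diff(const struct timeval *start, const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_usec - start->tv_usec) / 1e6;
}

/* Time one call of fn with gettimeofday(); returns wall-clock seconds. */
double time_call(void (*fn)(void))
{
    struct timeval t0, t1;

    assert(fn != NULL);
    gettimeofday(&t0, NULL);
    fn();
    gettimeofday(&t1, NULL);
    return timeval_diff(&t0, &t1);
}
```

Note that gettimeofday() measures wall-clock time, so the result includes time spent waiting or descheduled, whereas clock() reports CPU time only.
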

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]