Re: [Nagios-devel] Logging API revamp

Ethan Galstad wrote:
> Andreas Ericsson wrote:
>> So, I started looking into revamping the event queue logic, but ended up
>> with a migraine from the cumbersome way logging is done, so I decided to
>> try doing something about it, and the attached 3-patch series is the
>> result from it.
>>
>> It compiles alright, both for nagios and the cgi's. I haven't done much
>> in the way of checking past that though, so testing would be welcome.
>>
>> Given that the patches don't change much in the way of logic, they
>> shouldn't really affect anything in any significant way.
>>
> [snip]
>
> Thanks for the patches - they are excellent ideas. I'll get them
> implemented when I get back to the US later this week.
>

Anytime. I guess the conference spurred some Nagios-hackativity into
me ;-)

> For the event queue, I was thinking that a skip list structure might be
> best for efficiency (http://en.wikipedia.org/wiki/Skip_list). The event
> queue is used in primarily two situations:
>
> 1. Popping events from the head of the list to be executed
> 2. Inserting events into the list (mid- or endpoint).
>
> #1 is very efficient with a linked list, but performance with #2 can be
> quite bad in large lists. Since a check event usually appears for each
> host/service that is defined, this can lead to bad performance - O(n^2)
> I believe - with large installations. A skip list would bring the
> performance closer to O(log n).
>
> Anyone have comments/experiences they care to share about the
> performance of skip lists and/or better alternatives?
>
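
To make the cost of case #2 concrete, here is a minimal sketch of a sorted
singly linked event list. The struct and function names are made up for
illustration and are not the actual Nagios event code. Popping the head is
constant time, while an insert has to walk past every earlier event, so
scheduling n events one by one can take on the order of n^2 steps in the
worst case.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical event node, not the real Nagios timed event struct. */
struct ev {
    time_t run_time;   /* when the event should execute */
    struct ev *next;
};

/* Insert e so the list stays sorted by run_time.
 * Returns the number of nodes walked, i.e. the O(n) part. */
static size_t ev_insert_sorted(struct ev **head, struct ev *e)
{
    size_t steps = 0;
    struct ev **pp = head;

    assert(head != NULL && e != NULL);
    while (*pp != NULL && (*pp)->run_time <= e->run_time) {
        pp = &(*pp)->next;
        steps++;
    }
    e->next = *pp;
    *pp = e;
    return steps;
}

/* Pop the earliest event from the head: O(1). Returns NULL if empty. */
static struct ev *ev_pop(struct ev **head)
{
    struct ev *e;

    assert(head != NULL);
    e = *head;
    if (e != NULL) {
        *head = e->next;
        e->next = NULL;
    }
    return e;
}
```

The step count returned by ev_insert_sorted() is only there to make the
linear walk visible; a skip list would replace that walk with an expected
O(log n) descent through its levels.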

A skiplist would probably do wonders. I've been experimenting with one
now, using the timestamp of the next scheduled execution as the key for
each element. Using max_normal_check_interval as num_buckets seems to be
the best bet so far, since that gives a decent dispersion while keeping
the buckets nearly saturated.

It's probably best to keep the bucket count within reasonable limits,
say between 256 and 1024 buckets (1024 seconds is roughly 17 minutes),
and possibly to keep num_buckets a power of 2 to avoid modulo operations,
which are quite slow on some CPUs.
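
As a rough sketch of the power-of-2 trick (helper names are hypothetical,
and I'm assuming one bucket per second of run time and non-negative
timestamps): clamp the requested bucket count to a power of 2 in the
256..1024 range, then pick the bucket with a bit mask instead of a modulo.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Smallest power of 2 that is >= wanted, clamped to [256, 1024].
 * The limits are the ones suggested in the text, not Nagios constants. */
static size_t clamp_pow2_buckets(size_t wanted)
{
    size_t n = 256;

    while (n < wanted && n < 1024)
        n <<= 1;
    return n;
}

/* Map a run time to a bucket. For a power-of-2 num_buckets,
 * (x & (num_buckets - 1)) equals (x % num_buckets) for x >= 0. */
static size_t bucket_for(time_t run_time, size_t num_buckets)
{
    /* num_buckets must be a non-zero power of 2 for the mask to work */
    assert(num_buckets != 0 && (num_buckets & (num_buckets - 1)) == 0);
    return (size_t)run_time & (num_buckets - 1);
}
```

With num_buckets = 1024, an event due at t and one due at t + 1024 land in
the same bucket, so each bucket still needs its own ordering by timestamp.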

For performance reasons, though, I think it'd be better to add the
scheduled event slot to the host/service structs. That way you can
always remove an event from the list with some simple pointer-fiddling.
The memory impact might hit large networks fairly badly, but those
should be running on pretty beefy hardware anyway, so 500KiB more
or less won't matter all that much.
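
Roughly what I mean by pointer-fiddling, as a sketch only: the structs
below are hypothetical stand-ins, not the real Nagios host/service
definitions, and I'm assuming a doubly linked list so that unlinking needs
no search.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Event slot meant to be embedded in the object it schedules. */
struct ev_slot {
    time_t run_time;
    struct ev_slot *prev, *next;
};

/* Hypothetical stand-in for the service struct. */
struct svc {
    const char *description;
    struct ev_slot check_slot;   /* lives inside the service itself */
};

struct ev_list {
    struct ev_slot *head, *tail;
};

/* Append a slot at the tail of the list. */
static void ev_push_tail(struct ev_list *l, struct ev_slot *s)
{
    assert(l != NULL && s != NULL);
    s->next = NULL;
    s->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = s;
    else
        l->head = s;
    l->tail = s;
}

/* Unlink a slot that is currently in l: O(1), no list walk needed. */
static void ev_unlink(struct ev_list *l, struct ev_slot *s)
{
    assert(l != NULL && s != NULL);
    if (s->prev != NULL)
        s->prev->next = s->next;
    else
        l->head = s->next;
    if (s->next != NULL)
        s->next->prev = s->prev;
    else
        l->tail = s->prev;
    s->prev = s->next = NULL;
}
```

Given a pointer to a service, rescheduling its check is then
ev_unlink(&queue, &svc->check_slot) followed by a fresh insert, at the
price of two pointers and a timestamp per host/service.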

--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]