Page 1 of 1

Re: [Nagios-devel] Skiplists (was: Re: Logging API revamp)

Posted: Mon Oct 15, 2007 9:45 am
by Guest

In message ,
Andreas Ericsson writes:

>John P. Rouillard wrote:
>> In message ,
>> Andreas Ericsson writes:
>>
>>> Andreas Ericsson wrote:
>>>> Ethan Galstad wrote:
>>>>> For the event queue, I was thinking that a skip list structure might be
>
>>>>> best for efficiency (http://en.wikipedia.org/wiki/Skip_list). The even
>t
>>>>> queue is used in primarily two situations:
>>>>>
>>>>> 1. Popping events from the head of the list to be executed
>>>>> 2. Inserting events into the list (mid- or endpoint).
>>>>>
>>>>> #1 is very efficient with a linked list, but performance with #2 can be
>>>>> quite bad in large lists. Since a check event usually appears for each
>>>>> host/service that is defined, this can lead to bad performance - O(n^2)
>>>>> I believe - with large installations. A skip list would bring the
>>>>> performance closer to O(log n).
>>>>>
>>>>> Anyone have comments/experiences they care to share about the
>>>>> performance of skip lists and/or better alternatives?
>>>>>
>>>> A skiplist would probably do wonders. I've been experimenting with one
>>>> now, actually using the timestamp for when next to execute
>>> Scratch that. I only glossed the skiplist page you linked, and
>>> misunderstood completely. I've been experimenting with a hash, except
>>> the time used for input is the key. Implementation is trivial and
>>> performance improvement should be fairly close to or better than a
>>> skip-list.
>>
>> Hmm, maybe I am missing something, but if I have an element with a
>> known key, I see how a hash table pointing into a conventional linked
>> list will speed up access. But doesn't case 2 take the form:
>>
>> insert node X that should run at time t into the linked list
>>
>> mean a sequential scan of the linked list? How do you know what time
>> to hash to find the node before time t?
>
>It's a sequential scan of a small part of the linked list, just like
>any non-perfect hash. Since perfect hashes aren't available for this,
>we'll have to use a non-perfect one.
>>
>> E.G. I want node X inserted at 10:24:33, how do I know if the prior
>> node is at 10:24:32, or 10:14:32?
>>
>
>With a 60 buckets in the hash-array, you wouldn't be able to tell the
>difference without comparing the timestamp of the other scheduled items
>in the list. With more than 600 buckets, they'd end up in separate lists.
>
>However, we won't ever have to walk through *all* the lists, we just have
>to find the right "end" of things.
>
>Assume for a moment that the highest normal_check_interval you have for
>your services is 15 minutes. 1024 buckets will hold your 15 minutes, with
>2.067 minutes to spare. Now let's say you want to schedule a service-check
>x seconds into the future.
>
>slot = time(NULL) + x;
>if (x > num_buckets) {
> walk_from_tail(event_array[slot]);
>}
>else {
> walk_from_head(event_array[slot]);
>}

Ah ok your hash function is the timestamp. So all items scheduled for
time X are in the same bucket if the number of buckets exceeds the
number of seconds from now till the maximum future time you have in
the list.

This works because the hash function preserves the order of the keys
rather than randonly distributing them like a normal has table. I.E.

key1 = timestamp1
key2 = timestamp2

and

hash(key1) > hash(key2) if key1 > key2

so walking the buckets is the same as skipping across the linked list
in order.

>Since we're not likely to schedule checks several hours away,

I have a few checks that run every couple of hours, so I would just
have a larger hash table, or would you need to have a seperate bin
that gets the entries that aren't schedulable in the normal hash
table, and it filters them in on a regular basis.

>and we don't
>have to walk *past* all items with the same scheduled time as our own event
>(since the list is going to be sorted based on a non-unique index), the most
>common case will be that we do only two operations to find our slot.

Since we can add the new item into the head/tail of the same bucket.

>>> In particular, I'm worried

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]