Re: [Nagios-devel] FW: Problem with initial service scheduling (2.0b3)


François Laupretre wrote:
> Sorry for posting this message again, but I cannot modify my production
> environment until I have an opinion from somebody who understands the
> 'interleave_block' logic.
>

I think Ethan's the only one who really does.

Here's the documentation for it, though:
http://nagios.sourceforge.net/docs/2_0/ ... terleaving

As for the algorithm, I believe

    max_service_check_spread * (total_active_services / total_scheduled_services)

is more correct.
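
As a quick sanity check, the figures from the quoted report below (4800 services, 600 active, max_service_check_spread=30) can be plugged into both expressions. This is only arithmetic, not Nagios code, and it assumes that every active service is also a scheduled one, which the report does not state explicitly.

```python
# Figures taken from the quoted report below.
max_service_check_spread = 30     # minutes
total_services = 4800             # 600 active + 4200 passive
total_active_services = 600
total_scheduled_services = 600    # assumption: every active service is scheduled

# Spread actually observed with 2.0b3, according to the report.
observed = max_service_check_spread * (total_services / total_scheduled_services)

# Spread given by the expression suggested above.
suggested = max_service_check_spread * (total_active_services / total_scheduled_services)

print(f"observed:  {observed:.0f} min ({observed / 60:.0f} h)")
print(f"suggested: {suggested:.0f} min")
```

Under that assumption the first expression gives the 4 hours seen in the scheduling queue, and the second collapses to max_service_check_spread itself.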


> Thanks in advance
>
>
>>-----Original Message-----
>>From: Laupretre, François (CALYON)
>>Sent: Thursday, June 09, 2005 2:56 PM
>>To: [email protected]
>>Subject: Problem with initial service scheduling (2.0b3)
>>
>>
>>Hi all,
>>
>>I currently have a configuration with 4800 services: 600
>>active and 4200 passive. As that number grew, I
>>noticed a problem in the way Nagios schedules their initial
>>check time. With the original 2.0b3 code and
>>max_service_check_spread=30, when I look at the scheduling
>>queue just after start, I see that the last service checks
>>are scheduled to run in 4 hours!
>>
>>This delay corresponds to:
>>
>>max_service_check_spread * (total_services / total_scheduled_services)
>>
>>when it should be equal to max_service_check_spread.
>>
>>I found the reason in event.c/init_timing_loop() and I am
>>including a change which appears to correct the problem. However,
>>as I am not sure I fully understand the 'interleave_block'
>>logic, this change should be treated with care.
>>
>>The reason: in the 'schedule service checks' section of
>>init_timing_loop(), the next check time is incremented for each
>>service, and not for each SCHEDULED service. So, in my case,
>>it is incremented 'total_services' times and the last check
>>time is equal to:
>>
>>Current_time + total_services * service_inter_check_delay
>>
>>when it should be:
>>
>>Current_time + total_scheduled_services * service_inter_check_delay
>>
>>which is consistent with the way service_inter_check_delay is computed.
>>
>>My change consists of taking the 'should_be_scheduled' check
>>out of the inner loop and adding a line so that the
>>code enters the inner 'interleave_block' loop only for active
>>checks. This way current_interleave_block goes from 0 to
>>total_scheduled_services instead of going up to total_services.
>>
>>Once again, the patch I am submitting seems to correct the
>>problem in MY case, but I don't know whether it is correct when
>>the interleave variables have different values.
>>
>>Regards
>>
>>François
>>
>
>
>

--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Lead Developer
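
The effect described in the quoted report can be reproduced with a small simulation. This is a minimal sketch and not the code from event.c: it ignores interleaving entirely, the function name last_check_offset is hypothetical, the delay is assumed to be the spread divided by the number of scheduled services, and the ordering of services (seven passive followed by one active, repeated) is invented for illustration.

```python
def last_check_offset(services, spread_seconds, count_only_scheduled):
    """Return the offset (in seconds) of the last scheduled initial check.

    services: list of booleans, True if the service should be scheduled.
    count_only_scheduled: False mimics the reported behaviour (the time slot
    advances for every service); True mimics the proposed change (the slot
    advances only for scheduled services).
    """
    n_scheduled = sum(services)
    if n_scheduled == 0:
        return 0.0
    # Assumed: the inter-check delay is derived from scheduled services only.
    delay = spread_seconds / n_scheduled
    slot = 0
    last = 0.0
    for should_be_scheduled in services:
        if should_be_scheduled:
            last = slot * delay
            slot += 1
        elif not count_only_scheduled:
            # Reported problem: unscheduled services still consume a slot.
            slot += 1
    return last


# 4800 services, 600 of them active, spread of 30 minutes (from the report).
services = ([False] * 7 + [True]) * 600
reported = last_check_offset(services, 30 * 60, count_only_scheduled=False)
patched = last_check_offset(services, 30 * 60, count_only_scheduled=True)
print(f"slot per service:           last check after {reported / 3600:.2f} h")
print(f"slot per scheduled service: last check after {patched / 60:.2f} min")
```

With these assumptions the first variant places the last check roughly 4 hours out, and the second keeps it just inside the 30-minute spread. Whether the same holds once interleaving is taken into account is exactly the open question in the thread.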





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]