[Nagios-devel] Problem with initial service scheduling (2.0b3)



Post by Guest »


Hi all,

I currently have a configuration with 4800 services: 600 active and 4200
passive. As that number grew, I noticed a problem in the way Nagios
schedules the initial check times. With the original 2.0b3 code and
max_service_check_spread=30, the scheduling queue just after startup shows
the last service checks scheduled to run in 4 hours!

This delay corresponds to:

max_service_check_spread * (total_services / total_scheduled_services)

that is, 30 * (4800 / 600) = 240 minutes in my case, when it should be
equal to max_service_check_spread.
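
As a quick check of that figure, here is a minimal C sketch that evaluates the expression with the numbers from this report. The function name is hypothetical and does not come from the Nagios source; it only restates the formula.

```c
#include <stdio.h>

/* Hypothetical helper: spread (in minutes) that results when the check
 * time is advanced once per service, scheduled or not. */
double observed_spread_minutes(double max_service_check_spread,
                               int total_services,
                               int total_scheduled_services)
{
    return max_service_check_spread *
           ((double)total_services / (double)total_scheduled_services);
}

/* Convenience printer for the configuration described in this post. */
void print_reported_case(void)
{
    printf("observed spread: %.0f minutes\n",
           observed_spread_minutes(30.0, 4800, 600));
}
```

With 30, 4800 and 600 the function returns 240 minutes, which is the 4 hours seen in the scheduling queue.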

I found the cause in event.c/init_timing_loop() and I am including a change
that appears to correct the problem. However, as I am not sure I fully
understand the 'interleave_block' logic, this change should be treated with
care:

The reason: in the 'schedule service checks' section of init_timing_loop(),
the next check time is incremented for each service, not for each SCHEDULED
service. So, in my case, it is incremented 'total_services' times and the
last check time is equal to:

Current_time + total_services * service_inter_check_delay

where it should be:

Current_time + total_scheduled_services * service_inter_check_delay

which is consistent with the way service_inter_check_delay is computed.
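
To illustrate the difference, here is a simplified C model of the two counting strategies. It is only a sketch under stated assumptions: the struct and function names are hypothetical, interleaving is ignored entirely, and this is not the code from init_timing_loop().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a service entry. */
struct svc {
    int should_be_scheduled;  /* 1 = actively checked, 0 = passive only */
    double next_check;        /* offset from start time, in seconds */
};

/* Behaviour described in the report: the slot advances for every
 * service, so passive services still consume time slots. */
double last_check_per_service(struct svc *list, size_t n, double delay)
{
    double last = 0.0;
    for (size_t slot = 0; slot < n; slot++) {
        if (!list[slot].should_be_scheduled)
            continue;
        list[slot].next_check = (double)slot * delay;
        if (list[slot].next_check > last)
            last = list[slot].next_check;
    }
    return last;
}

/* Expected behaviour: the slot advances only for scheduled services. */
double last_check_per_scheduled(struct svc *list, size_t n, double delay)
{
    double last = 0.0;
    size_t slot = 0;
    for (size_t i = 0; i < n; i++) {
        if (!list[i].should_be_scheduled)
            continue;
        list[i].next_check = (double)slot * delay;
        if (list[i].next_check > last)
            last = list[i].next_check;
        slot++;
    }
    return last;
}
```

Assuming 4800 services of which every eighth is active, and a 3-second delay (30 minutes spread over 600 scheduled checks), the first function places the last check at roughly 4 hours and the second at roughly 30 minutes.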

My change consists of taking the 'should_be_scheduled' check out of the
inner loop and adding a line so that the code enters the inner
'interleave_block' loop only for active checks. This way,
current_interleave_block goes from 0 to total_scheduled_services instead of
going up to total_services.

Once again, the patch I am submitting seems to correct the problem in MY
case, but I don't know whether it is correct when the interleave variables
have different values.

Regards

François


...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]