Auto Rescheduling Logic Confusion

blevans
Posts: 13
Joined: Mon Mar 23, 2015 3:40 pm

Auto Rescheduling Logic Confusion

Post by blevans »

I am trying to understand how the auto rescheduling logic is supposed to work.
The recommended values in nagios.cfg are:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45

Supposedly this change came after they realized that the default values (same as above, but with auto_rescheduling_window=180) could leave some checks never actually being executed.
How are these new values any different?
As I understand it, every 30 seconds Nagios takes the checks scheduled in the next 45 seconds and reschedules them more evenly.
Isn't it possible for a check to get rescheduled into the last 15 seconds of the 45-second window indefinitely?
Am I missing something? Do rescheduled checks get an elevated priority so they aren't rescheduled again?

It seems to me that the only way to prevent that is for the interval and the window to be the same.
That might mean a couple of checks falling right at each 30-second boundary go unnoticed,
but I'd rather have that than what I presume the behavior is with the "recommended" interval=30, window=45.
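
The reasoning above can be tried out with a toy Python sketch. It is not Nagios's actual scheduler: it assumes the model described in this post (every `interval` seconds, the checks due within the next `window` seconds are spread evenly across that window, keeping their order), treats each check as one-shot, and uses a hypothetical function name, `simulate`. It only counts how many rescheduling passes touch each check before it runs.

```python
def simulate(interval, window, initial_times, horizon):
    """Toy model (an assumption, not Nagios code): count how many
    rescheduling passes move each check before it executes."""
    pending = {i: float(t) for i, t in enumerate(initial_times)}
    moves = {i: 0 for i in pending}
    now = 0
    while now < horizon and pending:
        # Checks due inside [now, now + window) get rescheduled this pass.
        due = sorted(
            (i for i, t in pending.items() if now <= t < now + window),
            key=lambda i: (pending[i], i),
        )
        for rank, i in enumerate(due):
            # Spread evenly across the window, preserving order.
            pending[i] = now + rank * window / len(due)
            moves[i] += 1
        # Anything scheduled before the next pass executes and leaves the model.
        next_pass = now + interval
        for i in [i for i, t in pending.items() if t < next_pass]:
            del pending[i]
        now = next_pass
    return moves


# Nine one-shot checks initially due at t = 0, 5, ..., 40 seconds.
times = list(range(0, 45, 5))
print(max(simulate(30, 45, times, 300).values()))  # 3
print(max(simulate(30, 30, times, 300).values()))  # 1
```

In this toy model, a check that lands in the 15-second overlap is touched again by the next pass when window=45, while with window=30 no check is touched more than once. Whether the real scheduler behaves this way is exactly the open question here.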

I had trouble understanding how the rescheduling works from reading the code.
Any insights would be helpful!
Last edited by blevans on Wed Jan 18, 2023 9:44 am, edited 1 time in total.
blevans
Posts: 13
Joined: Mon Mar 23, 2015 3:40 pm

Re: Auto Rescheduling Logic Confusion

Post by blevans »

Here's where they indicate that the values should be changed to 30/45 from 30/180:
https://nagios.force.com/support/s/arti ... g-4f7efc76

But the default values in nagios.cfg with Core v4.4.6 are still the old ("buggy") values (30/180)!

Can anyone from the Nagios team share some insight on whether this feature is safe to use?
Arielwilson1
Posts: 1
Joined: Thu Sep 07, 2023 4:48 am

Re: Auto Rescheduling Logic Confusion

Post by Arielwilson1 »

You're right, setting both the interval and the window to 30 seconds seems like a safer approach to avoid unwanted behavior.
evelynwresker
Posts: 1
Joined: Tue Oct 03, 2023 7:04 am

Re: Auto Rescheduling Logic Confusion

Post by evelynwresker »

It's great that you're delving into the details of the auto rescheduling logic in Nagios. Understanding how these settings work can help fine-tune your monitoring system.

You're correct in your interpretation of how the new values function. The settings you mentioned essentially aim to evenly distribute checks within the specified window. However, you've also raised a valid concern about checks potentially getting repeatedly rescheduled into the last 15 seconds of the 45-second window.

To answer your question, Nagios does not automatically assign elevated priority to rescheduled checks to prevent them from being rescheduled again. Your observation is therefore accurate: without elevated priority, a check that lands in the last 15 seconds of the window can be picked up and rescheduled again by the next pass.

Your suggestion of setting the interval and window to the same value is a reasonable approach if you want to avoid this issue. A few checks falling right at each 30-second transition might go unnoticed by the rescheduler, but as you mentioned, that predictability is often preferable to dealing with the complexities of repeated rescheduling.

If you have concerns about the behavior of rescheduled checks, and if the predictability of check timing is crucial for your monitoring system, aligning the interval and window values could be a practical solution. It's all about finding the balance that works best for your specific monitoring needs.