Next Scheduled Check over the Time Period

antoniosc
Posts: 6
Joined: Tue May 28, 2019 4:25 am

Next Scheduled Check over the Time Period

Post by antoniosc »

Hi,

We're migrating from our old Nagios 3.0.3 to Nagios Core 4.4.3.
During a parallel run to verify the migration, I'm seeing strange behavior for some alarms that are scheduled to run only within a limited time period.

Example:
timeperiod: 04:00-04:05 (from sunday to saturday)
max_check_attempts 5
check_interval 20
retry_interval 3
notification_interval 30

Please correct me if I'm wrong. When computing the "Next Scheduled Check" time, Nagios checks whether the candidate time falls within a valid timeperiod:
if YES -> it keeps that time as the "Next Scheduled Check"
if NO -> it sets the "Next Scheduled Check" to the first valid time in the next timeperiod (in this example, the next 04:00)
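
The rule described above can be sketched in a few lines of Python. This is only an illustration of the expected behavior, not Nagios code: the function name `next_valid_time` is hypothetical, the timeperiod is modeled as a single daily window, and treating the end of the window as exclusive is an assumption.

```python
from datetime import datetime, time, timedelta


def next_valid_time(candidate: datetime, start: time, end: time) -> datetime:
    """Return candidate if it lies in the daily window [start, end),
    otherwise the start of the next window (hypothetical helper)."""
    window_start = datetime.combine(candidate.date(), start)
    window_end = datetime.combine(candidate.date(), end)
    if window_start <= candidate < window_end:
        return candidate                      # YES branch: keep the time
    if candidate < window_start:
        return window_start                   # NO branch: window is later today
    return window_start + timedelta(days=1)   # NO branch: window is tomorrow


# Last check at 04:00, check_interval 20 -> candidate 04:20, outside 04:00-04:05.
last_check = datetime(2019, 5, 28, 4, 0)
candidate = last_check + timedelta(minutes=20)
print(next_valid_time(candidate, time(4, 0), time(4, 5)))  # 2019-05-29 04:00:00
```

With this rule a check limited to 04:00-04:05 always lands on the next 04:00, which is the behavior described for the old installation.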

Our old Nagios works this way, so there is no problem there.
Nagios Core 4.4.3 does not seem to behave the same way, and I see a scenario like this:
Next Scheduled Check: the day after at 04:13
Last Check Time: the last time the check was performed during the valid timeperiod (days before)

So it seems that Nagios tries to run the next check as scheduled but, after checking the timeperiod, skips it because the scheduled time falls outside the timeperiod.

How can I verify this? How can I "force" it?

To complete the picture, I migrated all the alarm configurations from the old host to the new one, so the timeperiods, intervals, and check commands are exactly the same.

Here is a real example:
Current Status:   OK   (for 0d 1h 13m 3s+)
Status Information: OK, it works
Performance Data:  
Current Attempt: 1/3  (HARD state)
Last Check Time: 04-30-2019 14:15:55
Check Type: ACTIVE
Check Latency / Duration: 0.000 / 14.129 seconds
Next Scheduled Check:   05-29-2019 11:08:45

Service configuration:
check_interval 20
notification_interval 30
retry_interval 3
max_check_attempts 3

Timeperiod
monday 11:00-11:01
tuesday 11:00-11:01
wednesday 11:00-11:01
thursday 11:00-11:01
friday 11:00-11:01
saturday 11:00-11:01
sunday 11:00-11:01


EDIT: I found this in the official documentation, but it's not what I'm seeing:
Specifying a timeperiod in the check_period directive allows you to restrict the time that Nagios Core perform regularly scheduled, active checks of the host or service. When Nagios Core attempts to reschedule a host or service check, it will make sure that the next check falls within a valid time range within the defined timeperiod. If it doesn't, Nagios Core will adjust the next check time to coincide with the next "valid" time in the specified timeperiod. This means that the host or service may not get checked again for another hour, day, or week, etc.
https://assets.nagios.com/downloads/nag ... riods.html


Thanks!
swolf

Re: Next Scheduled Check over the Time Period

Post by swolf »

Hi @antoniosc,

I'm setting up a configuration to see whether I can reproduce this -- will give you an update in a couple days.
swolf

Re: Next Scheduled Check over the Time Period

Post by swolf »

Okay, I tried this on my development machine and was able to reproduce the problem. I've opened an issue here, hopefully we'll be able to get this fixed shortly.
antoniosc
Posts: 6
Joined: Tue May 28, 2019 4:25 am

Re: Next Scheduled Check over the Time Period

Post by antoniosc »

swolf wrote:Okay, I tried this on my development machine and was able to reproduce the problem. I've opened an issue here, hopefully we'll be able to get this fixed shortly.
Ok, thanks for your fast feedback!
I'll wait for updates
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Next Scheduled Check over the Time Period

Post by benjaminsmith »

Hi @antoniosc,
Thanks for pointing this out. For new updates, you can follow the issue on GitHub.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
antoniosc
Posts: 6
Joined: Tue May 28, 2019 4:25 am

Re: Next Scheduled Check over the Time Period

Post by antoniosc »

Hi,

In the meantime, could you suggest a workaround?

Thanks!
swolf

Re: Next Scheduled Check over the Time Period

Post by swolf »

Hi @antoniosc

I think you have two options here:
1.) Make the timeperiods as large as the check_interval

For the example you posted, this would mean setting your timeperiod to 11:00-11:20 each day, and keeping the service definition the same.

2.) If the check needs to run exactly at 11 AM, then this probably shouldn't be an active check. Instead, set up a daily cron job that runs the check and submits a passive check result to Nagios.

Let us know if you need any further assistance.
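
For option 2, a minimal sketch of the submission step, assuming the result is handed to Nagios through the external command file with `PROCESS_SERVICE_CHECK_RESULT`. The host name, service description, and command file path are placeholders (the path shown is a common source-install location and may differ on your system), and the service must accept passive checks.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows plugin conventions:
    0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    """
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{output}\n"


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write one command line to the Nagios command file (a named pipe).

    The default path is an assumption; check command_file in nagios.cfg.
    """
    with open(command_file, "w") as fh:
        fh.write(line)


if __name__ == "__main__":
    # Hypothetical host and service names; run the real check here
    # and pass its exit code and output instead of fixed values.
    print(passive_result_line("myhost", "Daily 11AM check", 0, "OK, it works"), end="")
    # submit(...) would be called here on the Nagios server.
```

A crontab entry such as `0 11 * * *` followed by the path to this script would then run it every day at 11:00.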
antoniosc
Posts: 6
Joined: Tue May 28, 2019 4:25 am

Re: Next Scheduled Check over the Time Period

Post by antoniosc »

Hi,

thanks!

About solution 1):
Won't the issue occur again? I mean, I can schedule:
check_period 11:00-11:20
check_interval 20

The first check will run at 11:00 and the next one will be scheduled at 11:20 (and maybe that will still work). What about the one after that? It will be scheduled at 11:40, so the issue is still there.

Am I wrong?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Next Scheduled Check over the Time Period

Post by cdienger »

Setting the timeperiod to 11:00-11:20 and the check interval to 20 minutes should cause it to run sometime between 11:00 and 11:20.
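
The arithmetic behind this can be checked with a small simulation. It is a sketch under stated assumptions, not a model of the actual scheduler: it assumes that attempts falling outside the timeperiod are skipped and retried exactly one check_interval later, it treats the window end as exclusive, and it ignores latency and scheduling jitter. The function name `hits_per_day` is hypothetical.

```python
from datetime import datetime, time, timedelta


def hits_per_day(first_attempt, interval_min, start, end, days=3):
    """Group, by date, the attempts spaced interval_min minutes apart
    that land inside the daily window [start, end)."""
    hits = {}
    t = first_attempt
    stop = first_attempt + timedelta(days=days)
    while t < stop:
        if start <= t.time() < end:
            hits.setdefault(t.date(), []).append(t.time())
        t += timedelta(minutes=interval_min)
    return hits


first = datetime(2019, 5, 29, 11, 8, 45)  # "Next Scheduled Check" from the example

# Original 1-minute window: attempts at hh:08:45, hh:28:45, hh:48:45 never hit it.
narrow = hits_per_day(first, 20, time(11, 0), time(11, 1))
print(len(narrow))  # 0

# 20-minute window: one attempt lands inside it each day.
wide = hits_per_day(first, 20, time(11, 0), time(11, 20))
print({d.isoformat(): len(v) for d, v in wide.items()})
# {'2019-05-29': 1, '2019-05-30': 1, '2019-05-31': 1}
```

Because 20 minutes divides a day evenly, attempts repeat at the same clock times every day, so a window as long as the interval is hit once a day under these assumptions, although not necessarily at 11:00 sharp.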
antoniosc
Posts: 6
Joined: Tue May 28, 2019 4:25 am

Re: Next Scheduled Check over the Time Period

Post by antoniosc »

I'm wondering about updating the Next Scheduled Check value manually, for example with a shell script. Where is this information stored and updated?
I mean, I would like an automated script to do what we can do manually in the GUI with "Re-schedule the next check of this service".

In the meantime, we're following the updates on the GitHub issue.
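
One way to script this is the external command interface, which provides `SCHEDULE_SVC_CHECK` and `SCHEDULE_FORCED_SVC_CHECK` commands. The sketch below only builds the command line; the host name, service description, and command file path are placeholders, and whether a forced check suits this case is an assumption to verify. Editing the status file by hand is unlikely to help, since Nagios regenerates it from its in-memory state.

```python
import time
from datetime import datetime


def schedule_check_line(host, service, check_time, forced=True, now=None):
    """Format a SCHEDULE_[FORCED_]SVC_CHECK external command line.

    check_time is a Unix timestamp; a forced check is run regardless of
    the service's check_period.
    """
    ts = int(time.time() if now is None else now)
    cmd = "SCHEDULE_FORCED_SVC_CHECK" if forced else "SCHEDULE_SVC_CHECK"
    return f"[{ts}] {cmd};{host};{service};{int(check_time)}\n"


if __name__ == "__main__":
    # Hypothetical names; schedule the check for today at 11:00 local time.
    when = datetime.now().replace(hour=11, minute=0, second=0, microsecond=0)
    line = schedule_check_line("myhost", "Daily 11AM check", when.timestamp())
    print(line, end="")
    # To submit, write the line to the command file named by command_file
    # in nagios.cfg (often /usr/local/nagios/var/rw/nagios.cmd):
    # with open("/usr/local/nagios/var/rw/nagios.cmd", "w") as fh:
    #     fh.write(line)
```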