Possible reoccurring downtime bug

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
nickap
Posts: 26
Joined: Wed Jun 26, 2019 9:43 am

Possible reoccurring downtime bug

Post by nickap »

We have reoccurring downtime scheduled to trigger every Saturday and Sunday:
This service has been scheduled for fixed downtime from 10-05-2019 12:00:00 to 10-07-2019 12:00:00. Notifications for the service will not be sent out during that time period.

This service has been scheduled for fixed downtime from 09-29-2019 12:00:00 to 10-01-2019 12:00:00. Notifications for the service will not be sent out during that time period.
This month had 5 weekends and the scheduled downtime triggered on Sunday and Monday. Is this a bug, or am I missing something?
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Possible reoccurring downtime bug

Post by benjaminsmith »

Hello @nickap
nickap wrote:This month had 5 weekends and the scheduled downtime triggered on Sunday and Monday. Is this a bug, or am I missing something?
Let's check all the time settings on the server to make sure there isn't a mismatch that could be causing a scheduling error. Please post the output of the following commands.

First, the PHP time and the server time.

Code: Select all

php -r 'echo date("D M j G:i:s T Y")."\n";' 
date 
Also, check the timezone for both PHP and the server.

Code: Select all

grep "date.timezone" /etc/php.ini
ls -l /etc/localtime
Lastly, the time settings for the database.

Code: Select all

echo "SELECT NOW();" | mysql -u root -pnagiosxi
Reference: Nagios XI Changing The System Time
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
nickap
Posts: 26
Joined: Wed Jun 26, 2019 9:43 am

Re: Possible reoccurring downtime bug

Post by nickap »

[root@nagiosxi ~]# php -r 'echo date("D M j G:i:s T Y")."\n";'
Mon Sep 30 17:19:57 EDT 2019
[root@nagiosxi ~]# date
Mon Sep 30 17:20:01 EDT 2019
[root@nagiosxi ~]#
[root@nagiosxi ~]# grep "date.timezone" /etc/php.ini
; http://www.php.net/manual/en/datetime.c ... e.timezone
date.timezone = US/Eastern
[root@nagiosxi ~]# ls -l /etc/localtime
lrwxrwxrwx 1 root root 30 Sep 24 2015 /etc/localtime -> /usr/share/zoneinfo/US/Eastern
[root@nagiosxi ~]#
[root@nagiosxi ~]# echo "SELECT NOW();" | mysql -u root -pnagiosxi
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: ...)
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Possible reoccurring downtime bug

Post by benjaminsmith »

Hello @nickap,

Thanks for running those commands; the output looks normal. Please send me your system profile so I can review the logs.

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file, share it in a private message, and then reply to this post to bring it back up in the queue.
nickap
Posts: 26
Joined: Wed Jun 26, 2019 9:43 am

Re: Possible reoccurring downtime bug

Post by nickap »

Sent profile.zip via PM, thanks Ben!
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Possible reoccurring downtime bug

Post by benjaminsmith »

Hi,

Thanks for sending over the system profile. Recurring downtime works by running a cron job that sends a command to Nagios to schedule the downtime. The settings are being written to /usr/local/nagios/etc/recurringdowntime.cfg, and the job is working, just not at the times you intended.

Regarding the settings, you have selected downtime to start at 12:00 on Saturday and Sunday for a period of 48 hours, so the Sunday downtime starts while the host or service is still in the Saturday downtime. When do you want the scheduled downtime to end? Please try setting it to start on Saturday for 48 hours, or on Saturday and Sunday for 24 hours each.
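
To make the overlap concrete, here is a minimal sketch of the arithmetic, assuming GNU date and the 48-hour (2880-minute) duration described above. The helper name window_end is made up for illustration and is not a Nagios XI command; times are treated as UTC only so that the arithmetic is not affected by DST.

```shell
#!/bin/sh
# Illustration only: print when a downtime window ends, given its start
# time and a duration in minutes. Requires GNU date (-d).
# window_end is a hypothetical helper, not part of Nagios XI.
window_end() {
    # Parse the start as UTC so adding minutes is plain arithmetic.
    start_epoch=$(LC_ALL=C date -u -d "$1 UTC" +%s)
    end_epoch=$((start_epoch + $2 * 60))
    LC_ALL=C date -u -d "@$end_epoch" '+%a %m-%d-%Y %H:%M'
}

# Both days selected, each with a 48-hour window starting at 12:00.
echo "Saturday 10-05-2019 12:00 window ends: $(window_end '2019-10-05 12:00' 2880)"
echo "Sunday   10-06-2019 12:00 window ends: $(window_end '2019-10-06 12:00' 2880)"
```

With these inputs the Saturday window runs until Monday 12:00 and the Sunday window until Tuesday 12:00, so the two overlap from Sunday 12:00 to Monday 12:00 and together cover 72 hours rather than 48.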

Please post the recurring downtime log so we can check it for error messages. Also, try setting up a test recurring downtime schedule and then check the log to make sure it started at the correct time.

Code: Select all

tail /usr/local/nagiosxi/var/recurringdowntime.log
nickap
Posts: 26
Joined: Wed Jun 26, 2019 9:43 am

Re: Possible reoccurring downtime bug

Post by nickap »

Thanks Ben, I've removed Sunday from the scheduled downtime and changed the schedule to Saturday for 48 hours. I will post the log on Monday after the schedule kicks off.

On a side note, should the scheduled downtime be capped at 1440 minutes (24 hours) if more than one day is selected? Or could there at least be a description indicating that the windows will overlap if the duration is greater than 24 hours?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Possible reoccurring downtime bug

Post by mbellerue »

nickap wrote:On a side note, should the scheduled downtime be capped at 1440 minutes (24 hours) if more than one day is selected? Or could there at least be a description indicating that the windows will overlap if the duration is greater than 24 hours?
I will see if we can add a bit of text in the scheduled downtime section about this. Restricting the time to 24 hours might be a good idea, too. Let's see how your server does with the scheduled downtime, and from there we can look at a feature request.
nickap
Posts: 26
Joined: Wed Jun 26, 2019 9:43 am

Re: Possible reoccurring downtime bug

Post by nickap »

Code: Select all

tail /usr/local/nagiosxi/var/recurringdowntime.log
check successful
candidate_timestamp: 1570896000,2019-10-12 12:00
got candidate_day_of_week: sat, checking: sat
check successful
got candidate_month_of_year: oct, checking: jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec
check successful
all parameters match, re-adjusting candidate for proper time
candidate_timestamp: 1570896000000, 2019-10-12 12:00
Downtime exists with start_time: 1570896000, and duration 172800 seconds ..
NOT SCHEDULING
I've pasted the log from the weekend. I actually got an alert triggered at 1:00 AM Saturday for this. I think it should have been suppressed, or something is not configured right.
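
One quick way to sanity-check an alert against a downtime window is to compare epoch values. The sketch below uses the start_time and duration from the log above, and it assumes the 1:00 AM alert fell on the same Saturday that the window starts. The helper name in_window is made up for illustration.

```shell
#!/bin/sh
# Illustration only: report whether a moment falls inside a downtime window.
# Arguments are epoch seconds: moment, window start, window end.
# in_window is a hypothetical helper, not part of Nagios XI.
in_window() {
    if [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]; then
        echo "inside"
    else
        echo "outside"
    fi
}

start=1570896000               # start_time from the log (Saturday 12:00)
end=$((start + 172800))        # start plus the logged 48-hour duration
alert=$((start - 11 * 3600))   # ASSUMPTION: 01:00 on that same Saturday

in_window "$alert" "$start" "$end"
```

Under those assumptions the alert lands 11 hours before the window opens, so a 12:00 start would not cover it.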
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Possible reoccurring downtime bug

Post by mbellerue »

Can we see more of that log? The section you have posted is for 2019-10-12. I'd like to see those same messages, but for 2019-10-05.
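
For reference, the epoch values in a log entry can be decoded with GNU date to confirm which weekend it refers to. This is a minimal sketch, assuming the server's US/Eastern timezone shown earlier in the thread; the POSIX TZ rule is used in its place so the example does not depend on the zoneinfo database, and fmt_epoch is a made-up helper name.

```shell
#!/bin/sh
# Illustration only: decode epoch values from recurringdowntime.log.
# Requires GNU date. The TZ rule below is a stand-in for US/Eastern.
TZ='EST5EDT,M3.2.0,M11.1.0'
export TZ

# fmt_epoch is a hypothetical helper: epoch seconds -> local date and time.
fmt_epoch() {
    LC_ALL=C date -d "@$1" '+%Y-%m-%d %H:%M %Z'
}

start=1570896000     # start_time from the posted log
duration=172800      # duration in seconds from the posted log

echo "window start: $(fmt_epoch "$start")"
echo "window end:   $(fmt_epoch "$((start + duration))")"
echo "length:       $((duration / 3600)) hours"
```

The same helper can be pointed at the start_time values from the 2019-10-05 entries once they are posted.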