Recurring Maintenance not working

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

Yes that is the error we received in the Web UI.

As of 8:34 AM today, a recurring maintenance was created for Salt Master Server; you can see that information here.

https://www.screencast.com/t/niNFFfLLGS

And yet you can see that we stopped the salt-master server on acentek-inframgmt

https://www.screencast.com/t/vDgR4w0PVfXH

Here you can see that the event log for acentek-inframgmt is not showing that it went into recurring maintenance.

https://www.screencast.com/t/eXW6pUADy

And now you can see that the Salt Master Server service went down hard and a notification was sent out to OpsGenie.

https://www.screencast.com/t/u9MYVocayE3I

Thoughts?
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

Where should I be looking to verify that recurring downtimes are set correctly outside of the UI? Is this stored in the MySQL DB? Is that the reason I am getting PHP errors in httpd's error log?

[Mon Jul 16 08:54:51.881642 2018] [:error] [pid 25954] [client 192.168.61.68:4300] PHP Notice: Undefined variable: search_str in /usr/local/nagiosxi/html/includes/components/xicore/recurringdowntime.inc.php on line 196, referer: http://nagios.acentek.net/nagiosxi/
JGCG
Posts: 45
Joined: Fri Sep 29, 2017 6:31 am

Re: Recurring Maintenance not working

Post by JGCG »

Just to chime in; hopefully I can help.
I had a similar issue last week where hosts/services in scheduled downtime were still alerting before the downtime ended.
It was all configured correctly, and the logs even showed 'Downtime Started', but then the alerts kicked in and sent notifications, followed by 'Downtime Ended'.

Since I've done the below I've had no issues (unfortunately, I have to do it each time I amend/add a new recurring downtime).
Go to the 'Scheduled Downtime' section
Delete all the scheduled downtimes
Manually run the perl script as the Nagios user to set all the scheduled downtimes again: /usr/local/nagiosxi/cron/./recurringdowntime.pl
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

That perl script did it.

Type | Date / Time | Information
Service Critical | 2018-07-16 12:04:21 | SERVICE ALERT: acentek-inframgmt;Salt Master Server;CRITICAL;HARD;5;inactive
Service Critical | 2018-07-16 12:03:32 | SERVICE ALERT: acentek-inframgmt;Salt Master Server;CRITICAL;SOFT;4;inactive
Service Critical | 2018-07-16 12:02:22 | SERVICE ALERT: acentek-inframgmt;Salt Master Server;CRITICAL;SOFT;3;inactive
Service Critical | 2018-07-16 12:01:22 | SERVICE ALERT: acentek-inframgmt;Salt Master Server;CRITICAL;SOFT;2;inactive
Service Critical | 2018-07-16 12:00:22 | SERVICE ALERT: acentek-inframgmt;Salt Master Server;CRITICAL;SOFT;1;inactive
Scheduled Downtime Start | 2018-07-16 11:54:59 | SERVICE DOWNTIME ALERT: acentek-inframgmt;Salt Master Server;STARTED; Service has entered a period of scheduled downtime
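
Entries like the ones above can also be pulled from the Nagios Core log outside the UI. This is a minimal sketch, assuming the log lives at /usr/local/nagios/var/nagios.log (a common default, not confirmed in this thread); the function name downtime_events is made up for illustration.

```shell
#!/bin/sh
# Assumed default location of the Nagios Core log; override NAGIOS_LOG if yours differs.
NAGIOS_LOG="${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}"

# downtime_events LOGFILE HOST SERVICE
# Print the SERVICE DOWNTIME ALERT and SERVICE ALERT lines for one service.
# grep -F treats the patterns as fixed strings, so ';' and spaces need no escaping.
downtime_events() {
    grep -F -e "SERVICE DOWNTIME ALERT: $2;$3;" -e "SERVICE ALERT: $2;$3;" "$1"
}

if [ -r "$NAGIOS_LOG" ]; then
    downtime_events "$NAGIOS_LOG" "acentek-inframgmt" "Salt Master Server"
else
    echo "log not readable: $NAGIOS_LOG" >&2
fi
```

If a STARTED line shows up before the CRITICAL lines, as it does here, the service was inside the downtime window when the alerts fired.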

It works now.

So I'm guessing that over the last couple of updates the cron jobs have been wiped on the server?

Is there a script that builds cron jobs for the nagios user?
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Recurring Maintenance not working

Post by npolovenko »

@acentek, Please upload the /usr/local/nagios/etc/recurringdowntime.cfg file so we can check whether the recurring downtime entries get written out successfully.

Also, let's check the permissions in the etc folder:

ls -l /usr/local/nagios/etc/
You can also tail the cmdsubsys log while you're scheduling downtime and watch the console for errors:

tail -f /usr/local/nagiosxi/var/cmdsubsys.log

Let's check the permissions on: /usr/local/nagios/etc/recurringdowntime.cfg
https://support.nagios.com/kb/article.php?id=61

And please upload:

/usr/local/nagiosxi/var/recurringdowntime.log
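
A rough way to collect the ownership and mode of both requested files in one pass. This is only a sketch: it assumes GNU stat (as found on the CentOS/RHEL systems XI typically runs on), and describe_file is a hypothetical helper name.

```shell
#!/bin/sh
# describe_file PATH
# Print "owner:group mode path" for one file, or a note if it is missing.
# Uses GNU coreutils stat; the -c format flag is not available on BSD/macOS stat.
describe_file() {
    if [ -e "$1" ]; then
        stat -c '%U:%G %a %n' "$1"
    else
        echo "missing: $1"
    fi
}

# Both paths are the ones requested in this post.
describe_file /usr/local/nagios/etc/recurringdowntime.cfg
describe_file /usr/local/nagiosxi/var/recurringdowntime.log
```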
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

https://drive.google.com/open?id=1vpQ2O ... OHpotjPjm7

https://drive.google.com/open?id=1VPHCw ... i6ox8Z0ENm

First one is the recurringdowntime.cfg

Second one is recurringdowntime.log

Here are the permissions on the /usr/local/nagios/etc/ folder:

[nagios@nagios cron]$ ls -l /usr/local/nagios/etc/
total 552
-rw-rw-r-- 1 apache nagios 3153 Apr 23 12:30 cgi.cfg
-rw-rw-r-- 1 apache nagios 3090 Feb 2 11:42 cgi.cfg_20180202-114234
-rw-rw-r-- 1 apache nagios 3090 Feb 2 12:04 cgi.cfg_20180202-120347
drwxrwxr-x 4 apache nagios 34 Feb 2 12:58 cisco
-rw-rw-r-- 1 apache nagios 36001 Jul 16 14:49 commands.cfg
-rw-rw-r-- 1 apache nagios 1039 Jul 16 14:49 contactgroups.cfg
-rw-rw-r-- 1 apache nagios 16955 Jul 16 14:49 contacts.cfg
-rw-rw-r-- 1 apache nagios 1722 Jul 16 14:49 contacttemplates.cfg
-rw-rw-r-- 1 apache nagios 796 Jul 16 14:49 hostdependencies.cfg
-rw-rw-r-- 1 apache nagios 794 Jul 16 14:49 hostescalations.cfg
-rw-rw-r-- 1 apache nagios 786 Jul 16 14:49 hostextinfo.cfg
-rw-rw-r-- 1 apache nagios 97236 Jul 16 14:49 hostgroups.cfg
drwxrwxr-x. 2 apache nagios 65536 Jul 13 15:21 hosts
-rw-rw-r-- 1 apache nagios 18321 Jul 16 14:49 hosttemplates.cfg
drwxrwxr-x. 2 apache nagios 6 Jul 13 16:04 import
-rw-rw-r-- 1 apache nagios 5866 Jul 13 16:04 nagios.cfg
-rw-rw-r-- 1 apache nagios 5778 Feb 2 11:42 nagios.cfg_20180202-114234
-rw-rw-r-- 1 apache nagios 5822 Feb 2 12:04 nagios.cfg_20180202-120347
-rw-rw-r-- 1 apache nagios 2229 Nov 14 2016 ndo2db.cfg
-rw-rw-r-- 1 apache nagios 4827 Nov 14 2016 ndomod.cfg
-rw-rw-r-- 1 apache nagios 12463 Jul 2 12:21 nrpe.cfg
-rw-rw-r-- 1 apache nagios 7988 Nov 14 2016 nrpe.cfg.old
-rw-rw-r-- 1 apache nagios 5346 Mar 9 15:49 nsca.cfg
drwxrwxr-x. 4 apache nagios 4096 Jul 9 16:17 pnp
-rw-rw-r-- 1 apache nagios 15572 Jul 16 11:50 recurringdowntime.cfg
-rw-rw-r-- 1 apache nagios 277 Jun 11 15:16 resource.cfg
-rw-rw-r-- 1 apache nagios 1627 Nov 14 2016 send_nsca.cfg
-rw-rw-r-- 1 apache nagios 1183 Jul 16 14:49 servicedependencies.cfg
-rw-rw-r-- 1 apache nagios 800 Jul 16 14:49 serviceescalations.cfg
-rw-rw-r-- 1 apache nagios 792 Jul 16 14:49 serviceextinfo.cfg
-rw-rw-r-- 1 apache nagios 5009 Jul 16 14:49 servicegroups.cfg
drwxrwxr-x 2 apache nagios 61440 Jul 16 14:49 services
-rw-rw-r-- 1 apache nagios 25287 Jul 16 14:49 servicetemplates.cfg
drwxrwxr-x. 2 apache nagios 65 Jul 9 16:17 static
-rw-rw-r-- 1 apache nagios 14155 Jul 16 14:49 timeperiods.cfg



As I asked earlier, should I see something in the nagios user's crontab that creates the recurring downtimes weekly?

Let me know if there is anything i can change.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Recurring Maintenance not working

Post by npolovenko »

@acentek, The cron is running because I see entries in recurringdowntime.log. In the XI GUI, please go to the Admin menu, click System Profile in the left column, and then click View System Info. Look for the Date/Time section and let me know whether PHP Time and System Time match exactly and whether the time is accurate.
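
To confirm from the shell that a cron entry exists, one option is to search the system cron directory and the nagios user's crontab for the script name. This is a sketch under assumptions: the /etc/cron.d location and the helper name find_cron_entries are not taken from this thread.

```shell
#!/bin/sh
# find_cron_entries DIR PATTERN
# List cron lines under DIR that mention PATTERN (fixed string, recursive).
find_cron_entries() {
    grep -rF -- "$2" "$1" 2>/dev/null
}

# Assumed location for packaged cron jobs; adjust for your system.
find_cron_entries /etc/cron.d recurringdowntime || echo "no match under /etc/cron.d"

# Per-user crontab; reading another user's crontab needs root.
if [ "$(id -u)" -eq 0 ]; then
    crontab -l -u nagios 2>/dev/null | grep -F recurringdowntime \
        || echo "no match in nagios crontab"
fi
```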
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

PHP Time: Mon, 16 Jul 2018 16:01:33 -0500
System Time: Mon, 16 Jul 2018 16:01:33 -0500

Those two values are matching.
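
The same comparison can be made from the command line. This is a sketch, assuming the php CLI is installed; the CLI can load a different php.ini than the web server, so the values shown in the GUI remain the ones that matter for XI.

```shell
#!/bin/sh
# Print system and PHP time in the same RFC 2822 style as the System Profile page.
echo "System Time: $(date -R)"

if command -v php >/dev/null 2>&1; then
    # date("r") is PHP's RFC 2822 format; date("O") is the UTC offset, e.g. -0500.
    echo "PHP Time:    $(php -r 'echo date("r");')"
    php_offset=$(php -r 'echo date("O");')
else
    echo "PHP Time:    (php CLI not found)"
    php_offset=""
fi

sys_offset=$(date +%z)
# A differing offset may point at a timezone setting mismatch.
if [ -n "$php_offset" ] && [ "$php_offset" != "$sys_offset" ]; then
    echo "UTC offsets differ: system $sys_offset, PHP $php_offset"
fi
```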

Next suggestion?
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Recurring Maintenance not working

Post by npolovenko »

@acentek, Please let me know which service or host you scheduled to be in downtime and then send me your system profile so I can review it. To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and upload it to a cloud storage service of your choice. You can share a link with me in a personal message.
After you upload the profile, please post something in this thread to bring it up in the support queue.


Profile was received and shared with the support team.
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Recurring Maintenance not working

Post by acentek »

The host I was playing with on Friday and this morning was:

acentek-inframgmt

The Service is called "Salt Master Server"

As I said, after I ran the recurringdowntime.pl script suggested by someone else, it started working. So after 11:50 AM CST it worked.