Recurring Downtime and Host/Service Checks

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Recurring Downtime and Host/Service Checks

Post by jkinning »

We have monthly maintenance windows for our various groups (Mainframe, Telecom, Network, Server Management), and I use recurring downtime to schedule the downtime each month. Yesterday was the Server Management maintenance day, and I have the window set to 0000-1810 to accommodate both non-prod and prod systems. Technically, the window ends at 1800, but I included a 10-minute buffer to try to prevent erroneous pages from being sent to the on-call person.

It was brought to my attention that someone from Server Management logged into Nagios to check the status of the hosts and services and make sure everything was good before Nagios started sending notifications. They said nothing was shown until after 1810, and then everything appeared under the Technical Overview view. Hosts in scheduled downtime are still being monitored, or at least they appear to be, so I am wondering whether he should have seen all the hosts. He is an admin in Nagios and can see and change all hosts and services. He wanted to validate that everything was good and correct any problems Nagios was showing before the downtime expired, to prevent notifications being sent to the on-call person.

Is there a better method to do this? Should I schedule the recurring downtime and delay notifications by 15 or 30 minutes, or should these hosts and services have been visible in the Technical Overview during recurring downtime?
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Recurring Downtime and Host/Service Checks

Post by bwallace »

Scheduling downtime only suppresses notifications during the specified time period. The checks still run as usual, and any alerts continue to be displayed in the UI; the notifications for them are simply not sent. So yes, this admin should have been able to see all hosts and services, business as usual. It is odd that he didn't see anything until the downtime had expired.

I was unable to reproduce this here on version 5.2.9. What XI version are you running? Is there anything peculiar in /var/log/httpd/error_log from around this time?
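
If it helps to confirm from the shell that hosts in downtime are still being checked, here is a minimal sketch that reads the Nagios status file and lists every host whose downtime depth is above zero, along with its current state and last check time. The path `/usr/local/nagios/var/status.dat` and the field names (`host_name`, `current_state`, `last_check`, `scheduled_downtime_depth`) are assumptions based on a typical install, and `hosts_in_downtime` is just a hypothetical helper name, so adjust as needed.

```shell
#!/usr/bin/env bash
# Sketch: list hosts that are currently in scheduled downtime.
# ASSUMPTIONS: the status file path and field names below match your install;
# hosts_in_downtime is a hypothetical helper, not a Nagios XI command.

hosts_in_downtime() {
    local status_file="${1:-/usr/local/nagios/var/status.dat}"
    awk '
        # Start of a host block: reset the fields we collect.
        /^[ \t]*hoststatus[ \t]*[{]/ {
            in_host = 1; name = ""; state = ""; last = ""; depth = 0
            next
        }
        # Inside a host block, keep the value after the first "=".
        in_host && /^[ \t]*host_name=/                { name  = $0; sub(/^[^=]*=/, "", name) }
        in_host && /^[ \t]*current_state=/            { state = $0; sub(/^[^=]*=/, "", state) }
        in_host && /^[ \t]*last_check=/               { last  = $0; sub(/^[^=]*=/, "", last) }
        in_host && /^[ \t]*scheduled_downtime_depth=/ { depth = $0; sub(/^[^=]*=/, "", depth) }
        # End of the block: print the host only if it is in downtime.
        in_host && /^[ \t]*[}]/ {
            if (depth + 0 > 0)
                printf "%s state=%s last_check=%s\n", name, state, last
            in_host = 0
        }
    ' "$status_file"
}
```

Running `hosts_in_downtime` a few minutes apart during the window should show `last_check` (an epoch timestamp) advancing for each listed host if checks are still being executed. For hosts, a `current_state` of 0 means UP.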
Be sure to check out the Knowledgebase for helpful articles and solutions!
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: Recurring Downtime and Host/Service Checks

Post by jkinning »

I am running 5.2.9 on CentOS 6.8 and do see error messages so I am attaching log file.
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Recurring Downtime and Host/Service Checks

Post by bwallace »

Thanks, but what client IP can we focus on in that log? What particular time frame?
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: Recurring Downtime and Host/Service Checks

Post by jkinning »

The time frame is 6am to 6:10pm, and any host in the Windows Prod group will do; yellowfin1p might be a good one.
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Recurring Downtime and Host/Service Checks

Post by bwallace »

Those details all look fine so between that and the error log you posted, there are not any clues to the cause of this behavior, unfortunately.
Is this reproducible on your side?
I attempted as much here but everything worked as expected.
jkinning
Posts: 747
Joined: Wed Oct 09, 2013 2:54 pm

Re: Recurring Downtime and Host/Service Checks

Post by jkinning »

I have a note to check it out more closely next month during the maintenance window. I am not sure if it is just a fluke, because I've had this set up for a couple of years now and no one has said anything until just now.

Good to hear that from an "expert" view things look alright.
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Recurring Downtime and Host/Service Checks

Post by bwallace »

Thanks. Should this occur again, definitely run these commands while reproducing the issue, then provide the output:

tail -f /var/log/httpd/error_log
tail -f /usr/local/nagiosxi/var/cmdsubsys.log
We can leave this thread open in the meantime...
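
Since each `tail -f` above holds its terminal until interrupted, one option is to follow both files together and save the output for posting. The following is only a sketch, assuming GNU `tail` and `timeout` from coreutils are available; `capture_logs` is a hypothetical helper name, not something that ships with Nagios XI.

```shell
#!/usr/bin/env bash
# Sketch: follow several log files at once for a fixed time and save what they emit.
# ASSUMPTIONS: GNU coreutils tail and timeout; capture_logs is a hypothetical helper.

capture_logs() {
    local duration="$1" outfile="$2"
    shift 2
    # -n 0 starts at the current end of each file, so only new lines are captured.
    # -F keeps following a file by name even if it is rotated.
    # timeout exits with status 124 when the time limit is reached, which is the
    # expected way for this capture to end, so that status is treated as success.
    timeout "$duration" tail -n 0 -F "$@" > "$outfile" || [ "$?" -eq 124 ]
}
```

For example, `capture_logs 600 /tmp/xi-capture.log /var/log/httpd/error_log /usr/local/nagiosxi/var/cmdsubsys.log` would record ten minutes of new entries from both logs while the issue is being reproduced. With more than one file, `tail` prefixes each chunk with a `==> filename <==` header, which makes it clear where each line came from.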