Scheduled Downtime Stops Functioning After Restart

Post by CGraham »

When someone schedules downtime, it works fine until someone else applies changes in the configuration manager. Once the changes are applied, notifications start going out despite the scheduled downtime. On the host status detail screen, the "half-pie" scheduled downtime icon disappears, but the scheduled downtime comment remains. Additionally, if you follow the scheduled downtime link on the left, the downtime is still listed there.

UPDATE: I tried just restarting Nagios instead of using the "Apply Configuration" button. This also causes the scheduled downtime icon to disappear and notifications to begin sending.

Attempted solutions: I read through the FAQ and tried "killall nagios" followed by "service nagios start", to no avail.

Here's the messages log of the restart:

Jun 18 13:30:34 [server name redacted] nagios: PROGRAM_RESTART event encountered, restarting...
Jun 18 13:30:34 [server name redacted] nagios: ndomod: Shutdown complete.
Jun 18 13:30:34 [server name redacted] nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jun 18 13:30:34 [server name redacted] nagios: Nagios 3.4.1 starting... (PID=762)
Jun 18 13:30:34 [server name redacted] nagios: Local time is Mon Jun 18 13:30:34 EDT 2012
Jun 18 13:30:34 [server name redacted] nagios: LOG VERSION: 2.0
Jun 18 13:30:34 [server name redacted] nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jun 18 13:30:34 [server name redacted] nagios: ndomod: Successfully connected to data sink. 0 queued items to flush.
Jun 18 13:30:34 [server name redacted] nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.

System Profile Uploaded

Post by agriffin »

This information is normally written to /usr/local/nagios/var/retention.dat, which persists across Nagios restarts. The first things that come to mind are changes to nagios.cfg or permission issues with retention.dat. Have you modified Nagios' main config file? What's the output of the following command?

# ls -l /usr/local/nagios/var/retention.dat
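
To check the nagios.cfg side of this, a minimal sketch along the following lines could list the retention-related directives in one go. The config path /usr/local/nagios/etc/nagios.cfg is an assumption based on the default install layout, and the function name is just for illustration.

```python
from pathlib import Path

# Retention-related directives from the Nagios Core main config file.
RETENTION_KEYS = (
    "retain_state_information",
    "state_retention_file",
    "retention_update_interval",
    "use_retained_program_state",
    "use_retained_scheduling_info",
)


def retention_settings(cfg_text):
    """Return the retention-related key=value pairs found in nagios.cfg text."""
    found = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() in RETENTION_KEYS:
            found[key.strip()] = value.strip()
    return found


if __name__ == "__main__":
    # Assumed location of the main config file; adjust for your install.
    cfg = Path("/usr/local/nagios/etc/nagios.cfg")
    if cfg.exists():
        for key, value in retention_settings(cfg.read_text()).items():
            print(f"{key}={value}")
```

If retain_state_information is missing or set to 0, or state_retention_file points somewhere unexpected, that would be worth ruling out first.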

Post by CGraham »

Yes, I have edited the nagios.cfg file (extending the check timeout), but not the entry regarding state retention:

state_retention_file=/usr/local/nagios/var/retention.dat


[root@hostname libexec]# ls -l /usr/local/nagios/var/retention.dat
-rw------- 1 nagios users 6006132 Jun 19 14:11 /usr/local/nagios/var/retention.dat

Post by CGraham »

I just created a scheduled downtime while watching the file, and no entry was created. Here's the beginning of the file, if that helps:

########################################
# NAGIOS STATE RETENTION FILE
#
# THIS FILE IS AUTOMATICALLY GENERATED
# BY NAGIOS. DO NOT MODIFY THIS FILE!
########################################
info {
created=1340129481
version=3.4.1
last_update_check=1338826943
update_available=1
update_uid=1314624658
last_version=3.2.3
new_version=3.4.1
}
program {
modified_host_attributes=3
modified_service_attributes=3
enable_notifications=1
active_service_checks_enabled=1
passive_service_checks_enabled=1
active_host_checks_enabled=1
passive_host_checks_enabled=1
enable_event_handlers=1
obsess_over_services=0
obsess_over_hosts=0
check_service_freshness=1
check_host_freshness=0
enable_flap_detection=1
enable_failure_prediction=1
process_performance_data=1
global_host_event_handler=xi_host_event_handler
global_service_event_handler=xi_service_event_handler
next_comment_id=364
next_downtime_id=47
next_event_id=25490
next_problem_id=12509
next_notification_id=1047
}
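
Rather than eyeballing a 6 MB file, a rough sketch like the one below could pull out just the downtime entries. It assumes downtime is stored in blocks named hostdowntime and servicedowntime, using the same "name { key=value ... }" layout as the info and program blocks above. Those block names and the sample host web01 are assumptions for illustration, not something confirmed in this thread.

```python
def parse_blocks(text):
    """Parse 'name {' ... '}' blocks into (name, {key: value}) tuples."""
    blocks = []
    name, fields = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if name is None:
            # Outside a block: only a line ending in '{' opens one.
            if line.endswith("{"):
                name, fields = line[:-1].strip(), {}
        elif line == "}":
            blocks.append((name, fields))
            name, fields = None, None
        elif "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value
    return blocks


def downtime_entries(text):
    """Return only the blocks that describe scheduled downtime."""
    # Block names assumed from Nagios Core 3.x retention files.
    wanted = {"hostdowntime", "servicedowntime"}
    return [(n, f) for n, f in parse_blocks(text) if n in wanted]


if __name__ == "__main__":
    sample = "hostdowntime {\nhost_name=web01\ndowntime_id=46\n}\n"
    for block_name, fields in downtime_entries(sample):
        print(block_name, fields.get("host_name"), fields.get("downtime_id"))
```

To run it against the real file, read /usr/local/nagios/var/retention.dat as the nagios user or root and pass its contents to downtime_entries. As far as I know, Nagios rewrites this file only at its retention update interval and on shutdown or restart, so a freshly scheduled downtime may not show up right away.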

Post by scottwilkerson »

The scheduled downtime script runs only once per hour via cron, so you won't see the change until a couple of minutes past the hour.

Post by CGraham »

OK, so my understanding is that scheduled downtime isn't permanent until the top of the hour.

Can you tell me which cron job does this? And what would be the impact of increasing the number of runs per hour?

Basically we are a growing software company that is adding systems to Nagios constantly. This is causing issues with false alarms...

Post by scottwilkerson »

This is true.

You could certainly modify the cron job to run more often. It is found in /etc/cron.d/nagiosxi and is the line:

01  * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
Correction: the above isn't true. I'm editing my post because I had thought we were talking about recurring downtime.
Last edited by scottwilkerson on Wed Jun 20, 2012 9:45 am, edited 1 time in total.
Reason: I thought we were talking about recurring downtime...

Post by agriffin »

This is apparently a bug in the latest Nagios Core release. There's a fix posted on the bug report here. We are currently testing it and will likely ship a bug fix release soon.

Post by CGraham »

Thanks for the information. I certainly didn't remember losing my scheduled downtime every time I made changes.