Scheduling downtime
Posted: Thu May 14, 2015 5:30 pm
by bbailey6
Hi,
I'm still a little new to Nagios but I've been tasked with scheduling downtime for the week of a few of our servers. Our Nagios installation is still pretty new to everyone so I thought I would seek a best practice here.
We have 7 VMs that are going down for maintenance. I've examined a few of them and none of them have parents yet. I was thinking one way to do it would be to assign them all a parent (the same physical server), schedule downtime for the physical server, and then schedule triggered downtime for the children (VMs). I'm not sure that is the best way though; it seems like just as much work as scheduling downtime for all the hosts and all their services.
Can I schedule downtime for the hosts and will their services go down as well or do I have to schedule downtime for all the hosts and all their respective services?
I'm not really sure how to handle VMs as they could fall into any number of different groups, and I was wondering what others have done.
Thanks!
Re: Scheduling downtime
Posted: Thu May 14, 2015 6:42 pm
by Box293
Here's a good article on Parents:
http://nagios.sourceforge.net/docs/3_0/ ... ility.html
One of the key parts of this solution is the time it takes for the HOST object to transition from a SOFT warning/critical/unknown (wcu) state to a HARD wcu state. This is all determined by the max_check_attempts and retry_interval directives.
If you have services that have smaller values for the max_check_attempts and retry_interval directives, then these services will go HARD wcu before the host object does and hence alerts will be sent.
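A rough way to compare the two timings is to count the retries after the first failed check. The sketch below is a simplified model only: it assumes retry_interval is in minutes, ignores check scheduling and on-demand host checks, and uses made-up values rather than anything from this thread.

```python
def minutes_to_hard(max_check_attempts, retry_interval):
    """Minutes from the first failed check until the state turns HARD.

    Simplified model: the first failure counts as attempt 1, and each
    remaining attempt runs one retry_interval after the previous one.
    """
    return (max_check_attempts - 1) * retry_interval


# Hypothetical values, chosen only to illustrate the comparison.
host_delay = minutes_to_hard(max_check_attempts=5, retry_interval=1)
service_delay = minutes_to_hard(max_check_attempts=3, retry_interval=1)

# True means the service would reach a HARD state before the host does.
service_first = service_delay < host_delay
print(host_delay, service_delay, service_first)
```

With these example values the service reaches HARD two minutes ahead of the host, which is the window in which a service alert could go out.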
bbailey6 wrote:Can I schedule downtime for the hosts and will their services go down as well or do I have to schedule downtime for all the hosts and all their respective services?
Yes and no. The same thing I explained above can happen: the services can go HARD wcu before the host object does. It is easiest to schedule downtime for the services in one go as well.
So there are some tricks to make this a little quicker.
Add all the hosts to a hostgroup & Apply Config
Home > Details > Hostgroup Summary
For the group you just created, click the second icon "View Hostgroup Commands"
This will give you a screen with two handy links:
Schedule downtime for all hosts in this hostgroup
Schedule downtime for all services in this hostgroup
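Those two links correspond to two Nagios Core external commands, so the same request can be prepared outside the web UI. The sketch below only builds the command lines; the hostgroup name, author, comment and timestamps are placeholders, and the command file location mentioned in the comment is an assumption that depends on your install.

```python
import time


def hostgroup_downtime_commands(hostgroup, start, end, author, comment):
    """Build external command lines for a fixed downtime on a hostgroup."""
    now = int(time.time())
    duration = end - start
    # Argument order: hostgroup;start;end;fixed;trigger_id;duration;author;comment
    # fixed=1 means the window is exactly start..end; trigger_id=0 means not triggered.
    args = f"{hostgroup};{start};{end};1;0;{duration};{author};{comment}"
    return [
        f"[{now}] SCHEDULE_HOSTGROUP_HOST_DOWNTIME;{args}",
        f"[{now}] SCHEDULE_HOSTGROUP_SVC_DOWNTIME;{args}",
    ]


# Placeholder values: 11:00 and 19:00 UTC on 2015-05-16 as Unix timestamps.
commands = hostgroup_downtime_commands(
    "maintenance-vms", 1431774000, 1431802800, "nagiosadmin", "VM maintenance"
)
for line in commands:
    # Each line would be appended to the Nagios command file
    # (often /usr/local/nagios/var/rw/nagios.cmd, but check your own setup).
    print(line)
```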
How does this work for you? Does everything I've explained make sense?
Re: Scheduling downtime
Posted: Thu May 14, 2015 6:53 pm
by bbailey6
Hi,
Yes I think so. I need to check our setup and see what I can do.
Thanks Box. Can I get back with you on this tomorrow?
Re: Scheduling downtime
Posted: Thu May 14, 2015 6:54 pm
by Box293
For sure, let us know how it works for you.
Re: Scheduling downtime
Posted: Fri May 15, 2015 1:47 pm
by bbailey6
Hi Box,
Yeah I think I am in good shape here. Been reading up on parent/child relationships and it's a lot to take in but pretty cool.
On a side note, for scheduling downtime in Nagios, is it expressed in 24-hour format (military time)?
I am trying to set an 8 hour outage from 11am to 7pm. Would that be 11:00:00 to 19:00:00?
oh also, one of the hosts has 2 service checks associated with it. I think I can schedule downtime on the host and set the 2 services to be triggered downtime? I need to look into how that works a bit more but I thought I would throw it out there
thanks
Re: Scheduling downtime
Posted: Fri May 15, 2015 2:04 pm
by ssax
Correct, it is military time and you have the correct military conversion for 11am to 7pm.
Example
Start: 05-16-2015 11:00:00
End: 05-16-2015 19:00:00
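To double-check a window like this, the two strings can be parsed and subtracted. This is only a sketch: the format string is inferred from the example above, and the Unix timestamps it produces depend on the timezone of the machine running it.

```python
from datetime import datetime

# Format inferred from the example: month-day-year, 24-hour clock.
FORMAT = "%m-%d-%Y %H:%M:%S"

start = datetime.strptime("05-16-2015 11:00:00", FORMAT)
end = datetime.strptime("05-16-2015 19:00:00", FORMAT)

# Length of the outage in seconds and in hours.
duration_seconds = int((end - start).total_seconds())
duration_hours = duration_seconds / 3600
print(duration_seconds, duration_hours)

# Unix timestamps, interpreted in the local timezone of this machine.
start_epoch = int(start.timestamp())
end_epoch = int(end.timestamp())
print(start_epoch, end_epoch)
```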
You could set triggered downtime for the services as well. There is also another way:
If you make sure that your host check is checking more frequently than your service checks, you would only need to schedule downtime on the hosts/hostgroups.
- The host would enter downtime and the services would not notify because the host is in downtime.
Re: Scheduling downtime
Posted: Fri May 15, 2015 2:11 pm
by bbailey6
hmmm
so my host has check interval 3min, retry interval 1min, and max check attempts 3. the service is set up with the exact same numbers. are you saying that if I changed the service to, say, 4, 1, 4 respectively, I would only have to schedule downtime for the host?
Re: Scheduling downtime
Posted: Fri May 15, 2015 2:19 pm
by jdalrymple
The best description is in the Nagios Core schedule downtime page:
"Command Description
This command is used to schedule downtime for all services on a particular host. During the specified downtime, Nagios will not send notifications out about the host. Normally, a host in downtime will not send alerts about any services in a failed state. This option will explicitly set downtime for all services for this host. When the scheduled downtime expires, Nagios will send out notifications for this host as it normally would. Scheduled downtimes are preserved across program shutdowns and restarts. Both the start and end times should be specified in the following format: mm/dd/yyyy hh:mm:ss. If you select the fixed option, the downtime will be in effect between the start and end times you specify. If you do not select the fixed option, Nagios will treat this as "flexible" downtime. Flexible downtime starts when the host goes down or becomes unreachable (sometime between the start and end times you specified) and lasts as long as the duration of time you enter. The duration fields do not apply for fixed downtime. "
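The fixed versus flexible choice described there shows up as a single flag in the external commands SCHEDULE_HOST_DOWNTIME and SCHEDULE_HOST_SVC_DOWNTIME. The sketch below is an illustration under assumptions: the host name "vm01", the author, the comments and the timestamps are all placeholders, and the lines are only printed, not submitted to Nagios.

```python
import time


def host_and_services_downtime(host, start, end, fixed, duration, author, comment):
    """Build command lines covering a host and all of its services."""
    now = int(time.time())
    # fixed=1: downtime runs exactly from start to end.
    # fixed=0: flexible, starts when a problem appears inside the window
    #          and lasts for `duration` seconds.
    fixed_flag = 1 if fixed else 0
    args = f"{host};{start};{end};{fixed_flag};0;{duration};{author};{comment}"
    return [
        f"[{now}] SCHEDULE_HOST_DOWNTIME;{args}",
        f"[{now}] SCHEDULE_HOST_SVC_DOWNTIME;{args}",
    ]


# Placeholder window of 8 hours, given as Unix timestamps.
fixed_cmds = host_and_services_downtime(
    "vm01", 1431774000, 1431802800, True, 28800, "nagiosadmin", "fixed window"
)
# Flexible: up to 2 hours of downtime, starting somewhere inside the window.
flexible_cmds = host_and_services_downtime(
    "vm01", 1431774000, 1431802800, False, 7200, "nagiosadmin", "flexible window"
)
for line in fixed_cmds + flexible_cmds:
    print(line)
```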
Re: Scheduling downtime
Posted: Fri May 15, 2015 2:27 pm
by bbailey6
ah makes a lot of sense.
well damn, I went to commit a scheduled downtime and it's saying "Sorry, but you are not authorized to commit the specified command."
I checked my user account and I'm an admin with all the boxes checked.
Googling "Sorry, but you are not authorized to commit the specified command." returns 10 year old posts :/
hmm
ah think I might have found something:
http://tracker.nagios.org/view.php?id=387
this still an issue?
Re: Scheduling downtime
Posted: Fri May 15, 2015 2:28 pm
by jdalrymple
It's a 10 year old problem.
Can you try logging in as nagiosadmin and see if you get the same results?