Sort of automated downtime?
Ok, had an interesting question asked of me today.....
When taking down an environment, the DBAs run a script that shuts down a few things, does this and does that. How plausible would it be to add a step to that script that connects to the Nagios XI server and automagically schedules a downtime for the host the script was initiated from, and for all services on that host? They then run another script to bring everything back up. That would be the tricky part: we wouldn't want to remove the downtime (for reporting and other reasons) but end it. And when starting it, the downtime would be more or less indefinite until we told it to end.
They used to have this ability in OEM and asked if it could be added somehow. Every time I think of a way to do it, I then think of a reason that won't work 30 seconds later, lol.
Go with ideas now.....I'll shoot them all down like a WWII pilot!
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: Sort of automated downtime?
Hey Bandit,
This is pretty simple in fact. Check out my Nagios XI Downtime Framework http://exchange.nagios.org/directory/Pl ... rk/details
In the code you can find a line where I put the chosen hosts in downtime with NRDP. You could use something like this in your script. You just need a user with permissions, and to set the appropriate times and comment.
Hope it helps.
Grtz
Code: Select all
# Set host in downtime
$URL = "http://'REPLACE WITH FQDN OF NAGIOS XI SERVER'/nrdp/?cmd=submitcmd&token='REPLACE WITH NRDP AUTHENTICATION TOKEN'&command=SCHEDULE_HOST_DOWNTIME;$server;$start;$end;1;0;0;Nagios XI Downtime Dummy User;$comment"
$request1 = [System.Net.WebRequest]::Create($url)
$response1 = $request1.GetResponse()
$response1.Close()
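For a shell-side script on the database host, the same NRDP call can be sketched in Python. This is only a sketch under assumptions: the server name, token, author and comment are placeholders, and the helper names are made up for illustration. It reuses the `submitcmd` URL shape and field order from the snippet above, and adds the standard `SCHEDULE_HOST_SVC_DOWNTIME` external command, because `SCHEDULE_HOST_DOWNTIME` alone does not cover the services on the host.

```python
from urllib.parse import urlencode


def build_downtime_commands(host, start, end, author, comment):
    """Return the two external commands covering a host and all of its services.

    Field order after the host name: start;end;fixed;trigger_id;duration;author;comment
    (fixed=1, trigger_id=0, duration=0, as in the PowerShell example).
    """
    tail = f"{host};{int(start)};{int(end)};1;0;0;{author};{comment}"
    return [
        f"SCHEDULE_HOST_DOWNTIME;{tail}",
        f"SCHEDULE_HOST_SVC_DOWNTIME;{tail}",
    ]


def build_nrdp_url(base_url, token, command):
    """Return the NRDP submitcmd URL with the query string percent-encoded."""
    query = urlencode({"cmd": "submitcmd", "token": token, "command": command})
    return f"{base_url.rstrip('/')}/nrdp/?{query}"


if __name__ == "__main__":
    import time

    now = int(time.time())
    # Placeholder values: replace with the real host, server and NRDP token.
    for cmd in build_downtime_commands("db01", now, now + 3600, "dba-script", "planned outage"):
        print(build_nrdp_url("http://nagios.example.com", "REPLACE_WITH_TOKEN", cmd))
```

The sketch only prints the URLs; fetching each one (for example with `urllib.request.urlopen`) would submit the command. Encoding the query string avoids problems with the semicolons and spaces in the command.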
Nagios XI 5.8.1
https://outsideit.net
Re: Sort of automated downtime?
That's the issue, not knowing the exact outage window. Their script does write a "blackout" file to the / folder. I'm thinking of adding a check for that file: if it exists and no downtime exists, then schedule a 6 minute downtime. This check will run every 5 minutes, and if the file exists and a downtime is active, then extend it by 5 minutes. Not sure if you can modify an active downtime. I'll research that as time permits.
The other option I thought of was if the file exists schedule a 1 month downtime. Then whenever the file doesn't exist, modify the downtime to end. I pray that is possible, I can imagine the code in my head already if you can modify an active downtime.
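The first option (a rolling 6 minute window refreshed by a 5 minute check) can be sketched as a small decision function. This is a sketch under assumptions: the file path is hypothetical (the thread only says the file lives in /), and since modifying an active downtime is an open question here, "extend" is modelled as scheduling a fresh window that overlaps the old one.

```python
import os
import time

BLACKOUT_FILE = "/blackout"  # hypothetical path; only the folder is known from the thread
WINDOW_SECONDS = 6 * 60      # slightly longer than the 5 minute check interval


def plan_action(blackout_exists, downtime_active):
    """Decide what a single run of the check should do."""
    if blackout_exists:
        # No downtime yet: open one. Already active: add a fresh overlapping window.
        return "extend" if downtime_active else "schedule"
    # File is gone: do nothing and let the short window expire on its own.
    return "none"


def next_window(now, length=WINDOW_SECONDS):
    """Return (start, end) epoch seconds for the next downtime window."""
    now = int(now)
    return now, now + length


if __name__ == "__main__":
    exists = os.path.exists(BLACKOUT_FILE)
    # A real check would ask Nagios whether a downtime is active; False is a stand-in.
    print(plan_action(exists, downtime_active=False), next_window(time.time()))
```

With this variant nothing has to be ended explicitly: once the file disappears, the last window runs out within about 6 minutes. The one-month variant would instead need an explicit step to end the downtime.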
Re: Sort of automated downtime?
Ah sry, misread somewhere. Not knowing when the downtime ends will be tougher to solve.
Have a nice weekend!
Re: Sort of automated downtime?
Probably will end up needing some sort of flat file to store the state of the downtime between runs of the remote script as you mentioned. You can certainly delete downtime programmatically:
http://old.nagios.org/developerinfo/ext ... and_id=126
Just gotta find the id for it. Might need some nasty SQL to find that
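As an alternative to SQL, the downtime id can probably be read from the Nagios status file. The sketch below is a swapped-in technique, not what was suggested above, and it rests on assumptions: that the status file (commonly `/usr/local/nagios/var/status.dat`) contains `hostdowntime { ... }` and `servicedowntime { ... }` blocks with `host_name=` and `downtime_id=` lines. The function names are made up for illustration; `DEL_HOST_DOWNTIME` and `DEL_SVC_DOWNTIME` are the standard external commands that take a downtime id.

```python
import re

# Matches one downtime block: a header line, the key=value lines, and the closing brace.
BLOCK_RE = re.compile(
    r"^(hostdowntime|servicedowntime)\s*\{\s*$(.*?)^\s*\}\s*$",
    re.MULTILINE | re.DOTALL,
)


def find_downtimes(status_text, host):
    """Return (block_type, downtime_id) pairs for every downtime on `host`."""
    results = []
    for kind, body in BLOCK_RE.findall(status_text):
        fields = {}
        for line in body.splitlines():
            key, sep, value = line.strip().partition("=")
            if sep:
                fields[key] = value
        if fields.get("host_name") == host and "downtime_id" in fields:
            results.append((kind, int(fields["downtime_id"])))
    return results


def delete_command(kind, downtime_id):
    """Build the external command that removes one downtime entry."""
    name = "DEL_HOST_DOWNTIME" if kind == "hostdowntime" else "DEL_SVC_DOWNTIME"
    return f"{name};{downtime_id}"
```

Each command string returned by `delete_command` could then be submitted the same way as the scheduling commands.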
Former Nagios employee
Re: Sort of automated downtime?
tmcdonald wrote:Probably will end up needing some sort of flat file to store the state of the downtime between runs of the remote script as you mentioned. You can certainly delete downtime programmatically:
http://old.nagios.org/developerinfo/ext ... and_id=126
Just gotta find the id for it. Might need some nasty SQL to find that
Trevor, yeah, I know you can delete a downtime. However, that would screw up the information for the SLA report and anything else that relies on that data. Is there any way to modify a downtime to make it end "now"? Or change the end date/time to 1 minute in the future?
Edit: I went and actually read your link, and more awesome vague wording(lol):
Code: Select all
If the downtime is currently in effect, the service will come out of scheduled downtime (as long as there are no other overlapping active downtime entries)
Re: Sort of automated downtime?
Much like your 5 minute check with 6 minutes of downtime, this is saying that if you remove a downtime and it is the only downtime in effect at that time for that host or service, the host or service will properly come out of downtime. However, if you have overlapping downtimes, such as when you just ran the 5 minute check and rescheduled, so that you have the last minute or less of one downtime and ~6 minutes of a new one from your check, removing the older one would NOT take the host or service out of downtime, because the new ~6 minute downtime would still be in effect. Make more sense? Still not really what you were looking for though.
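The overlap behaviour described above can be illustrated with a toy model. This is not Nagios's internal logic, just a sketch of the rule "a host is in downtime while at least one entry covers the current time"; the ids and timestamps are invented.

```python
def in_downtime(windows, t):
    """True if any downtime entry (id -> (start, end)) covers time t."""
    return any(start <= t < end for start, end in windows.values())


def remove_downtime(windows, downtime_id):
    """Return a copy of the entries without the given id."""
    return {k: v for k, v in windows.items() if k != downtime_id}


if __name__ == "__main__":
    # Entry 1 is the old 6 minute window, entry 2 the one added by the next 5 minute check.
    windows = {1: (0, 360), 2: (300, 660)}
    t = 330  # both entries are active at this moment

    after_first = remove_downtime(windows, 1)
    print(in_downtime(after_first, t))   # still covered by entry 2

    after_both = remove_downtime(after_first, 2)
    print(in_downtime(after_both, t))    # no entry left
```

So a script that wants to end downtime early has to remove every entry that is active for the host, not only the most recent one.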
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: Sort of automated downtime?
sreinhardt wrote:Much like your 5 minute check with 6 minutes of downtime, this is saying that if you remove a downtime and it is the only downtime in effect at that time for that host or service, the host or service will properly come out of downtime. However, if you have overlapping downtimes, such as when you just ran the 5 minute check and rescheduled, so that you have the last minute or less of one downtime and ~6 minutes of a new one from your check, removing the older one would NOT take the host or service out of downtime, because the new ~6 minute downtime would still be in effect. Make more sense? Still not really what you were looking for though.
The wording "remove from downtime" is what is confusing. If it is the only downtime, how does it remove it: by ending the downtime early or by deleting the downtime? That is the distinction I am trying to understand. If it deletes the downtime, then the data would be wrong in the reports that exclude downtimes. However, if it somehow ends it early, then those reports will still have accurate data.
Does that explain my confusion better?
Re: Sort of automated downtime?
If you delete the downtime, you will end it early, though it is still in the logs, so it will not affect reporting (other than that it may pad your "uptime" percentage a bit less than leaving the downtime to run for its full duration).
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Sort of automated downtime?
abrist wrote:If you delete the downtime, you will end it early, though it is still in the logs, so it will not affect reporting (other than that it may pad your "uptime" percentage a bit less than leaving the downtime to run for its full duration).
Andy, screencap this because I may only say it once a decade...
I love you, you are a god among men!
Technically it is only because you happened to be the one to reply, but you gotta take what you can get!
You can close this. I'll work on my automated downtime add/remove check and will add it to the exchange when completed. I'm excited to work on this one!