Windows service restart using Nagios XI event handler
Hi Team,
We want to use the Nagios XI event handler feature to start Windows services when they go down. We were able to achieve this using the document below.
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
However, we noticed that the services were also being restarted during scheduled downtime, which we do not want.
We checked a few related topics on the Nagios forum; these are the ones we referred to:
https://support.nagios.com/forum/viewto ... 16&t=39864
https://support.nagios.com/forum/viewto ... =6&t=46993
We then changed our script as follows:
#!/bin/sh
# $1 = service state, $2 = host name, $3 = name of the service to restart on the Windows server,
# $4 = service state type, $5 = host downtime, $6 = service downtime
logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
    OK)
        ;;
    WARNING)
        ;;
    UNKNOWN)
        ;;
    CRITICAL)
        if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]] && [[ "$6" == "0" ]]
        then
            /usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
            echo ServiceState $4 >> $logfile
            echo HostDownTime $5 >> $logfile
            echo ServiceDownTime $6 >> $logfile
        fi
        ;;
esac
exit 0
According to those posts, the script above should not restart the Windows services while they are in downtime.
We used the command definition below:
$USER1$/service_restart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICEDOWNTIME$
I am not sure how to pass all the arguments in the command definition, and that seems to be why the script is not working for us.
Can you please help me write the exact command definition?
Also, please let me know if I have missed anything that is preventing the script from running properly.
Thanks in advance!!
Thanks & Regards,
I2MP Team.
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Windows service restart using Nagios XI event handler
Hello @RIDS_I2MP,
I believe you may just need to modify the check command and the script so that the script acts on the correct arguments. Looking at the command definition below, you are passing 4 arguments to service_restart.sh:
$USER1$/service_restart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICEDOWNTIME$
However, the script checks the state type and the downtime in arguments 4, 5 and 6:
if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]] && [[ "$6" == "0" ]]
then
Try adjusting your script to:
if [[ "$1" == "HARD" ]] && [[ "$4" == "0" ]]
Let me know if you're able to get it working successfully.
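One way to see which value lands in which positional parameter is to print the arguments with their index. The following is a minimal sketch; `show_args` is a hypothetical helper and the values are sample data, not real Nagios output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print each positional argument
# with its index so the order can be compared with the command definition.
show_args() {
    i=1
    for arg in "$@"; do
        printf '$%s=%s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

# With the four-macro command definition from the first post, the handler
# would receive something like this (sample values only):
#   $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICEDOWNTIME$
show_args CRITICAL 10.0.0.1 W32Time 0
```

With four macros, `$4` carries the service downtime depth rather than the state type, and `$5` and `$6` are empty, so a test that compares `$4` with `HARD` cannot succeed.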
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Windows service restart using Nagios XI event handler
Hi,
We have made the changes as per your suggestion, but it is still not working for us.
The service is not being started either in the normal scenario or during scheduled downtime.
#!/bin/sh
logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
    OK)
        ;;
    WARNING)
        ;;
    UNKNOWN)
        ;;
    CRITICAL)
        if [[ "$1" == "HARD" ]] && [[ "$4" == "0" ]]
        then
            /usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
        fi
        ;;
esac
exit 0
This is the updated script.
Thanks & Regards,
I2MP Team.
-
benjaminsmith
Re: Windows service restart using Nagios XI event handler
Hello @RIDS_I2MP,
1. Just to verify, have you tested the restart_service command from the terminal and confirmed that it works?
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3"
2. If that works, there is a mistake in the script: the first argument will be either OK, WARNING, UNKNOWN or CRITICAL. For the "if" check on the state type (HARD or SOFT), you'll need to adjust the service_restart command definition to pass $SERVICESTATETYPE$.
You can view the full list of Nagios system macros and their output on the page below. This should help you get the arguments and system macros set correctly to meet your requirements.
Standard Macros in Nagios
By the way, we help customers find the right resources, but custom script development is outside normal Nagios support. Let me know if you get it working.
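As an illustration of keeping the script and the command definition in step, here is a minimal sketch of the restart condition. It assumes a six-macro command definition in the order `$SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICESTATETYPE$ $HOSTDOWNTIME$ $SERVICEDOWNTIME$`. The function name `should_restart` is hypothetical, and POSIX `[ ]` tests are used so the sketch does not depend on bash being behind `/bin/sh`.

```shell
#!/bin/sh
# Assumed argument order (must match the command definition):
#   $1 service state      $2 host address        $3 Windows service name
#   $4 service state type $5 host downtime depth $6 service downtime depth
should_restart() {
    [ "$1" = "CRITICAL" ] && [ "$4" = "HARD" ] && [ "$5" = "0" ] && [ "$6" = "0" ]
}

# Sample calls with made-up values; the real handler would call check_nrpe
# instead of echo when should_restart succeeds.
if should_restart CRITICAL 10.0.0.1 W32Time HARD 0 0; then echo "restart"; else echo "skip"; fi
if should_restart CRITICAL 10.0.0.1 W32Time HARD 0 1; then echo "restart"; else echo "skip"; fi
```

The first call prints `restart` and the second prints `skip`, because a non-zero downtime depth means the service is inside scheduled downtime.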
Re: Windows service restart using Nagios XI event handler
Hi Support team,
We have checked the steps above. The first step works fine, and we have modified the script accordingly.
[root@eu2napu003 libexec]# cat restart_service.sh
#!/bin/sh
logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
    OK)
        ;;
    WARNING)
        ;;
    UNKNOWN)
        ;;
    CRITICAL)
        if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]]
        then
            /usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
        fi
        ;;
esac
exit 0
[root@eu2napu003 libexec]#
We have checked the settings in the GUI: the event handler is enabled, and we have also added the command for the script above.
$USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICESTATETYPE$ $SERVICEDOWNTIME$
It works when we invoke it manually from the command line on the Nagios server, as shown below:
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0 -- Should restart service -- and restarting as well
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 1 -- script not triggered as per condition
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 0 -- script not triggered as per condition
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 0 -- script not triggered as per condition
Note: W32Time -- is servicename that we are restarting.
But when the check is triggered from the GUI, the script does not seem to run, judging by the log file.
Could you please help us find the possible reason?
Thanks & Regards,
I2MP Team.
-
benjaminsmith
Re: Windows service restart using Nagios XI event handler
Hi @RIDS_I2MP,
Is the service entering a HARD critical state that would trigger the event handler to run? You can force this by submitting a passive check result to Nagios from the web interface. Go to Home > Service Status and use the search box in the upper right to find the service. Once found, click on the service, select the Advanced tab, and use the Submit Passive Check Results link in the commands box.
Can you post the modified check command to verify you have the correct system macros?
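To tell whether Nagios calls the handler at all, and with which values, it can help to log every invocation before any condition is tested. This is a minimal sketch; `log_invocation` is a hypothetical helper, and the demo writes to a temporary file rather than the real event handler log.

```shell
#!/bin/sh
# Hypothetical helper: append one line per invocation with a timestamp,
# the calling user and the raw arguments.
log_invocation() {
    logfile=$1
    shift
    printf '%s user=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -un)" "$*" >> "$logfile"
}

# Demo with a temporary file and sample values.
demo_log=$(mktemp)
log_invocation "$demo_log" CRITICAL 10.0.0.1 W32Time HARD 0
cat "$demo_log"
rm -f "$demo_log"
```

If no line appears after a HARD critical result, the handler is probably not being invoked. If a line appears with unexpected values, the macro order in the command definition is the more likely cause.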
Re: Windows service restart using Nagios XI event handler
Hi Support team,
We stopped the service manually and tried sending a passive check result, but the service does not come back up automatically even though the event handler is enabled in the GUI. However, if we invoke the script manually, it restarts the service.
Putty Command:
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0 -- stopped service restarted 10.147.209.40.
GUI command:
$USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICESTATETYPE$ $SERVICEDOWNTIME$ -- Not working
Please suggest why it is not working.
Thanks
RIDS Team
Thanks & Regards,
I2MP Team.
-
benjaminsmith
Re: Windows service restart using Nagios XI event handler
Hi RIDS_I2MP,
What are the permissions on the event handler script?
ls -l /usr/local/nagios/libexec/restart_service.sh
What happens when you switch to the nagios user account and run the script?
su nagios
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0
You are probably logged in as root when you are testing the command, and the nagios user may not have permission to execute the script.
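Besides execute permission on the script, the handler also appends to a log file, so both are worth checking as the nagios user. The sketch below is a hypothetical helper (`check_handler_access` is not a Nagios tool). It reports on whichever user runs it, and when run as root the write check always passes.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can execute the
# handler script and append to its log file.
check_handler_access() {
    script=$1
    logfile=$2
    rc=0
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        rc=1
    fi
    if [ -e "$logfile" ]; then
        # Existing log: the file itself must be writable.
        [ -w "$logfile" ] || { echo "not writable: $logfile"; rc=1; }
    else
        # Missing log: the parent directory must allow creating it.
        [ -w "$(dirname "$logfile")" ] || { echo "cannot create: $logfile"; rc=1; }
    fi
    return $rc
}

# Example call, after switching to the nagios user (paths from this thread):
# check_handler_access /usr/local/nagios/libexec/restart_service.sh \
#                      /usr/local/nagios/libexec/eventhandler.log
```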
Re: Windows service restart using Nagios XI event handler
Hi,
The nagios user has the required permission to execute the script.
[nagios@eu2napu003 libexec]$ ls -l /usr/local/nagios/libexec/restart_service.sh
-rwxrwxr-x 1 apache nagios 277 Aug 28 13:25 /usr/local/nagios/libexec/restart_service.sh
Thanks & Regards,
I2MP Team.
-
benjaminsmith
Re: Windows service restart using Nagios XI event handler
Hello @RIDS_I2MP,
Well, I'd like to review the Nagios logs to make sure the script is being called on a HARD critical state change, and to check for any error messages.
Please PM your system profile. Thanks.
To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.
4. Save the profile.zip file, share it in a private message, and then reply to this post to bring it up in the queue.
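While waiting on the profile review, the Nagios log can also be searched locally for event handler entries. The sketch below rests on two assumptions that are not stated in this thread: that event handler logging is enabled, and that entries contain the text `SERVICE EVENT HANDLER:` followed by semicolon-separated fields including the service name. `handler_log_lines` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: list event handler entries for one service from a
# Nagios log file. The "SERVICE EVENT HANDLER:" marker and the ";service;"
# field layout are assumptions about the log format.
handler_log_lines() {
    logfile=$1
    service=$2
    grep 'SERVICE EVENT HANDLER:' "$logfile" | grep -F ";$service;"
}

# Example call (the log path is an assumption for a default install):
# handler_log_lines /usr/local/nagios/var/nagios.log "Windows Time Service"
```

The second argument is the Nagios service description, which may differ from the Windows service name passed to the script. If the command prints nothing for a service that has just gone HARD critical, that suggests the handler is not being triggered.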