Windows service restart using Nagios XI event handler

Posted: Mon Aug 26, 2019 8:17 am
by RIDS_I2MP
Hi Team,

We want to use the Nagios XI event handler feature to start Windows services when they go down. We were able to achieve this using the document below.

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

However, we noticed that the services were also being restarted during downtime. We do not want the services to be started during downtime.

We checked a few related topics in the Nagios forum. These are the ones we referred to:

https://support.nagios.com/forum/viewto ... 16&t=39864
https://support.nagios.com/forum/viewto ... =6&t=46993

We then changed our script as follows:

#!/bin/sh
# $1 defines service state, $2 defines hostname, $3 defines servicename to restart on windows server, $4 defines service state type (HARD/SOFT), $5 defines host downtime, $6 defines service downtime

logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]] && [[ "$6" == "0" ]]
then
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
echo ServiceState $4 >> $logfile
echo HostDownTime $5 >> $logfile
echo ServiceDownTime $6 >> $logfile
fi
;;
esac
exit 0

As per the posts, the above script should not start the Windows services while they are in downtime.

We used the command definition below:

$USER1$/service_restart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICEDOWNTIME$

I am not sure how to pass all the arguments in the command definition, and that seems to be the reason the script is not working for us.

Can you please help me write the exact command definition?

Also, please let me know if I have missed anything that is preventing the script from running properly.

Thanks in advance!!

Re: Windows service restart using Nagios XI event handler

Posted: Mon Aug 26, 2019 2:24 pm
by benjaminsmith
Hello @RIDS_I2MP,

I believe you may just need to modify the check command and the script to make sure it's acting on the correct arguments. Looking at the command definition below, you are passing 4 arguments to service_restart.sh.

$USER1$/service_restart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICEDOWNTIME$ 
However, the script checks the state type and the downtime values in arguments 4, 5 and 6:
if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]] && [[ "$6" == "0" ]]
then
Try adjusting your script to:

if [[ "$1" == "HARD" ]] && [[ "$4" == "0" ]]
Let me know if you're able to get it working successfully.
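
One way to confirm which arguments Nagios actually hands to the handler is to log them at the top of the script. The following is a minimal sketch; the `log_args` helper name and the `argc=`/`args=` format are hypothetical and not part of the original script.

```shell
# Hypothetical debugging helper (not part of the original script):
# append a timestamp, the argument count and the arguments to a log file.
log_args() {
    _log_target="$1"   # first parameter: path of the log file
    shift              # remaining parameters: the handler's own arguments
    printf '%s argc=%d args=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$#" "$*" >> "$_log_target"
}
```

Calling `log_args "$logfile" "$@"` before the `case` statement should show how many arguments arrive on each run, which makes a mismatch between the command definition and the script's `$4`, `$5` and `$6` checks easier to spot.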

Re: Windows service restart using Nagios XI event handler

Posted: Tue Aug 27, 2019 9:54 am
by RIDS_I2MP
Hi,

We have made the changes as per your suggestion, but it's still not working for us.
The service does not get started either in the normal scenario or during scheduled downtime.

#!/bin/sh

logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
if [[ "$1" == "HARD" ]] && [[ "$4" == "0" ]]
then
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
fi
;;
esac
exit 0

This is the updated script.

Re: Windows service restart using Nagios XI event handler

Posted: Tue Aug 27, 2019 4:07 pm
by benjaminsmith
Hello @RIDS_I2MP,

1. Just to verify, have you tested the restart_service command line from the terminal and verified that it works?

/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3"
2. If that's correct, there's a mistake in the script, as the first argument will be either OK, WARNING, UNKNOWN or CRITICAL. For the "if" check on the state type (HARD or SOFT) in the script, you'll need to adjust the service_restart command to pass $SERVICESTATETYPE$.

You can view the full list of Nagios system macros and their output on the page below. This should help you get the arguments and system macros correctly set to meet your requirements.

Standard Macros in Nagios

By the way, we help customers find the right resources, but custom script development is outside normal Nagios support. Let me know if you get it working.
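
As a rough sketch of the point above, the state, state type and downtime depth can be tested together once the command definition passes `$SERVICESTATETYPE$`. The function name `should_restart` and its parameter order are hypothetical; in the handler, the values would come from whichever positional arguments the command definition supplies.

```shell
# Hypothetical helper: succeed only for a HARD CRITICAL state outside downtime.
#   $1 = service state      (value of $SERVICESTATE$, e.g. CRITICAL)
#   $2 = service state type (value of $SERVICESTATETYPE$: HARD or SOFT)
#   $3 = downtime depth     (value of $SERVICEDOWNTIME$, 0 = not in downtime)
should_restart() {
    [ "$1" = "CRITICAL" ] && [ "$2" = "HARD" ] && [ "$3" = "0" ]
}
```

Assuming the command definition passes the state as the first argument, the state type as the fourth and `$SERVICEDOWNTIME$` as the fifth, the handler could call it as `if should_restart "$1" "$4" "$5"; then ... fi`.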

Re: Windows service restart using Nagios XI event handler

Posted: Wed Aug 28, 2019 8:43 am
by RIDS_I2MP
Hi Support team,
We have checked the above steps. The first step worked fine, and we have modified the script accordingly.

[root@eu2napu003 libexec]# cat restart_service.sh
#!/bin/sh

logfile=/usr/local/nagios/libexec/eventhandler.log
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
if [[ "$4" == "HARD" ]] && [[ "$5" == "0" ]]
then
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3" >> $logfile
fi
;;
esac
exit 0

[root@eu2napu003 libexec]#

We have checked the settings in the GUI: the event trigger is enabled, and we have also added the _command for the above script.

$USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICESTATETYPE$ $SERVICEDOWNTIME$

It works when we invoke it manually from the command line on the Nagios server, as shown below:
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0 -- Should restart service -- and restarting as well
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 1 -- script not triggered as per condition
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 0 -- script not triggered as per condition
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time SOFT 0 -- script not triggered as per condition

Note: W32Time is the name of the service that we are restarting.

But when I checked from the GUI, the script does not seem to be triggered, going by the log file updates.

Could you please help us find the possible reason?

Re: Windows service restart using Nagios XI event handler

Posted: Wed Aug 28, 2019 1:24 pm
by benjaminsmith
Hi @RIDS_I2MP,

Is the service entering a HARD critical state that would trigger the event handler to run? You can force this by sending passive check results to Nagios from the web interface. Just go to Home > Service Status and then use the search box in the upper right to find the service. Once found, click on the service, select the Advanced tab and find the Submit Passive Check Results link in the commands box.
Can you post the modified check command to verify you have the correct system macros?
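
The same kind of passive result can also be submitted from the shell through the Nagios external command file, using the `PROCESS_SERVICE_CHECK_RESULT` external command. The sketch below only builds the command line; the helper name `passive_result_line` is hypothetical, and the command file path in the usage note assumes a default install location.

```shell
# Hypothetical helper: print a PROCESS_SERVICE_CHECK_RESULT external command line.
#   $1 = host name as configured in Nagios
#   $2 = service description as configured in Nagios
#   $3 = return code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN)
#   $4 = plugin output text
passive_result_line() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' "$(date +%s)" "$1" "$2" "$3" "$4"
}
```

For example, `passive_result_line "<host name>" "<service description>" 2 "forced CRITICAL for handler test" > /usr/local/nagios/var/rw/nagios.cmd` would submit one CRITICAL result, assuming the command file lives at that path. Depending on the service's max check attempts, more than one result may be needed before the state becomes HARD.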

Re: Windows service restart using Nagios XI event handler

Posted: Thu Aug 29, 2019 5:37 am
by RIDS_I2MP
Hi Support team,

We stopped the service manually and tried sending a passive check result, but the service does not come up automatically even though the event handler is enabled in the GUI. However, if we invoke the script manually from the command line, it restarts the service.

Putty Command:
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0 -- stopped service restarted 10.147.209.40.

GUI command:
$USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$ $SERVICESTATETYPE$ $SERVICEDOWNTIME$ -- Not working

Please suggest why it's not working.

Thanks
RIDS Team

Re: Windows service restart using Nagios XI event handler

Posted: Thu Aug 29, 2019 2:03 pm
by benjaminsmith
Hi RIDS_I2MP,
"We stopped the service manually and tried sending a passive check result, but the service does not come up automatically even though the event handler is enabled in the GUI. However, if we invoke the script manually from the command line, it restarts the service."
You are probably logged in as root when you are testing the command, and the nagios user may not have permission to execute the script.

What are the permissions on the event handler script?

ls -l /usr/local/nagios/libexec/restart_service.sh
What happens when you switch to the nagios user account and run the script?

su nagios
/usr/local/nagios/libexec/restart_service.sh CRITICAL 10.147.209.40 W32Time HARD 0
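
The permission questions above can be bundled into one check that is run after switching to the nagios user. This is only a sketch: the function name `check_handler_access` is hypothetical, and the log file test is an added assumption (the handler appends to eventhandler.log, so that file would need to be writable by the nagios user too), not something confirmed in this thread.

```shell
# Hypothetical pre-flight check, meant to be run as the nagios user.
#   $1 = path of the event handler script
#   $2 = path of the log file the handler appends to
# Prints one line per problem found and returns non-zero if there was any.
check_handler_access() {
    _problems=0
    if [ ! -x "$1" ]; then
        echo "not executable by $(id -un): $1"
        _problems=1
    fi
    if [ -e "$2" ]; then
        # Existing log file: the current user must be able to append to it.
        if [ ! -w "$2" ]; then
            echo "log not writable by $(id -un): $2"
            _problems=1
        fi
    elif [ ! -w "$(dirname "$2")" ]; then
        # No log file yet: the directory must allow creating it.
        echo "cannot create log as $(id -un): $2"
        _problems=1
    fi
    return "$_problems"
}
```

A call such as `check_handler_access /usr/local/nagios/libexec/restart_service.sh /usr/local/nagios/libexec/eventhandler.log` prints nothing when both checks pass for the current user.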

Re: Windows service restart using Nagios XI event handler

Posted: Fri Aug 30, 2019 9:02 am
by RIDS_I2MP
Hi,

The nagios user has the required permissions to execute the script.

[nagios@eu2napu003 libexec]$ ls -l /usr/local/nagios/libexec/restart_service.sh
-rwxrwxr-x 1 apache nagios 277 Aug 28 13:25 /usr/local/nagios/libexec/restart_service.sh

Re: Windows service restart using Nagios XI event handler

Posted: Fri Aug 30, 2019 1:00 pm
by benjaminsmith
Hello @RIDS_I2MP,

Well, I'd like to review the Nagios logs to make sure the script is getting called upon a HARD critical state change and to look for any error messages.

Please PM your system profile. Thanks.

To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.
4. Save the profile.zip file, share it in a private message, and then reply to this post to bring it up in the queue.
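
While gathering the profile, the Nagios log can also be searched directly for event handler entries. This is a minimal sketch that assumes event handler logging is enabled and that such entries contain the text `SERVICE EVENT HANDLER:` followed by semicolon-separated fields; the helper name `handler_log_entries` is hypothetical, and the log path in the usage note is the usual default rather than something confirmed for this system.

```shell
# Hypothetical helper: list event handler log entries for one service.
#   $1 = path to the Nagios log file
#   $2 = service description as configured in Nagios
handler_log_entries() {
    grep -F 'SERVICE EVENT HANDLER:' "$1" | grep -F ";$2;"
}
```

For example, `handler_log_entries /usr/local/nagios/var/nagios.log "<service description>"`. No output would suggest that Nagios never invoked the handler for that service, which points at the configuration rather than the script.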