Need help with eventhandlers to restart services
-
- Posts: 2
- Joined: Thu Feb 11, 2016 11:49 am
I'm getting started with Nagios and I want to learn some specific, interesting things to present in class, such as how event handlers work, and to implement one or a few examples. More specifically, I thought about trying to automatically restart services when they stop working properly or change to a HARD state on my monitored Windows 7 host, and later on my Debian 7.8 host if everything goes well.
The official Nagios documentation has a PDF here about it, but it only shows how to do it with the Nagios XI web interface. I only have Nagios Core installed, so I can't follow most of the steps. Out of frustration, I tried to install Nagios XI, but I canceled it because a warning appeared saying that this type of installation was only for CentOS/Red Hat and might cause trouble if Nagios Core was already installed.
Besides that, the official Nagios event handlers documentation doesn't help me because its example of restarting the HTTP service is not complete, and the external pages with examples don't help either: most of them are really old, and sometimes I don't even understand what they're doing.
I'd be very grateful if someone could show me a full example of how to implement a service restart using a Nagios event handler on a Windows host monitored with Nagios Core. In particular, I don't understand in which file I should set up the event handler and which commands I need to use. So far I have only worked a bit with check_nrpe, check_snmp, check_nt, and commands that show CPU load, memory usage, etc.
I'm running Nagios Core 4.1.1 on a Debian 7.8 VirtualBox machine with Nagios Plugins, SNMP and NRPE, monitoring a remote Windows 7 host with NSClient++ 0.4.4.15 (NRPE and SNMP installed & enabled).
Re: Need help with eventhandlers to restart services
I'm going to run through the instructions that you posted, https://assets.nagios.com/downloads/nag ... ios-XI.pdf, but directed more towards Core. Some of the steps will be the same. The official documentation works for both XI and Core, so refer back to it if you get lost.
Create A Batch File For The Check
Open your favorite text editor and paste in the following code:
@echo off
net stop %1
net start %1
@exit 0
Or, download it using this link:
http://assets.nagios.com/downloads/nagi ... runcmd.bat
Once completed, save it as a batch file runcmd.bat in your NSClient++'s scripts directory, usually
c:\program files\NSClient++\scripts
OK - we've created a script for NSClient++ to execute. Time to add it to the NSClient++ configuration.
Add the following string to the list of External Scripts:
runcmd=scripts\runcmd.bat "$ARG1$"
Also, verify that allow_arguments=1. If this variable is not set to 1, you will not be able to pass arguments to your scripts. Then save the .ini file.
Now, we've added the configuration so that NSClient++ can pick up on runcmd.bat. Navigate to services.msc and restart the NSClient++ service (nscp if you prefer to do it over cmd). Let's test this prior to adding it to an event handler.
cd /usr/local/nagios/libexec
./check_nrpe -H <Window Host IP Address> -p 5666 -c runcmd -a spooler
Run the above commands on your Nagios machine, and it should restart the spooler service.
Now, we need to create a bash script called servicerestart.sh located in the /usr/local/nagios/libexec/ directory.
Code: Select all
#!/bin/sh
# Event Handler for Restarting Windows Services
case "$1" in
    OK)
        ;;
    WARNING)
        ;;
    UNKNOWN)
        ;;
    CRITICAL)
        /usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c runcmd -a "$3"
        ;;
esac
exit 0
Change the permissions so that it will work with Nagios:
Code: Select all
chown nagios:nagios /usr/local/nagios/libexec/servicerestart.sh
chmod 775 /usr/local/nagios/libexec/servicerestart.sh
Now, modify your commands.cfg and add this:
Code: Select all
# 'service_restart' command definition
define command{
    command_name service_restart
    command_line $USER1$/servicerestart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$
}
Lastly, edit your service definition, and add this part to it:
Code: Select all
event_handler service_restart
Restart Nagios, and it should be working. Let me know if you run into any issues.
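The handler above attempts a restart on every CRITICAL result it is called with. A possible variation, modeled on the pattern in the official Nagios event handler documentation, also looks at the state type and attempt number, so the restart is tried only on the third SOFT attempt and again when the state turns HARD. Treat it as a sketch under assumptions: the argument order, the attempt threshold of 3 (which presumes max_check_attempts is 4), and the DRY_RUN and CHECK_NRPE variables are illustrative additions, not part of the original script. The command_line in commands.cfg would have to pass `$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ $_SERVICESERVICE$` in that order.

```shell
#!/bin/sh
# Sketch of a state-aware event handler (assumed variant of servicerestart.sh).
# Assumed arguments: state, state type, attempt, host address, Windows service name.
# DRY_RUN=1 is a hypothetical switch that prints the command instead of running it.

CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

restart_service() {
    # $1 = host address, $2 = Windows service name
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $CHECK_NRPE -H $1 -p 5666 -c runcmd -a $2"
    else
        "$CHECK_NRPE" -H "$1" -p 5666 -c runcmd -a "$2"
    fi
}

handle_event() {
    state="$1"; state_type="$2"; attempt="$3"; host="$4"; service="$5"
    case "$state" in
    CRITICAL)
        case "$state_type" in
        SOFT)
            # Wait until the third SOFT attempt before trying a restart.
            if [ "$attempt" -eq 3 ]; then
                restart_service "$host" "$service"
            fi
            ;;
        HARD)
            # Try once more when the problem becomes HARD.
            restart_service "$host" "$service"
            ;;
        esac
        ;;
    *)
        # OK, WARNING, UNKNOWN: nothing to do.
        ;;
    esac
    return 0
}

# When called by Nagios with all five arguments, act on them.
if [ "$#" -ge 5 ]; then
    handle_event "$@"
fi
```

Running it by hand with `DRY_RUN=1` and sample arguments shows which command would be sent without touching the Windows host.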
Former Nagios Employee
Re: Need help with eventhandlers to restart services
rkennedy wrote: I'm going to run through the instructions that you posted [...]
Thanks, really appreciated. I didn't have any trouble following the steps, but at the end, where you tell me to edit my service definition, I'm not quite sure how to set it up, or whether I should do it as a service or as a host (as they did in the Nagios XI PDF). I tried adding a new service definition at the end of my /usr/local/nagios/etc/objects/windows.cfg file with the following config:
Code: Select all
define service{
use generic-service
host_name Windows 50
service_description Restart Windows Service
check_command check-host-alive
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
register 0
event_handler service_restart
}
Re: Need help with eventhandlers to restart services
Can you post the result of verifying your config file? /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Former Nagios Employee
Re: Need help with eventhandlers to restart services
Hi guys, I have a question about this script. It works in my environment, but I can't find the definition of the $_SERVICESERVICE$ variable. I looked at https://assets.nagios.com/downloads/nag ... rvicestate but couldn't find it there. Can you tell me where it is defined, or where I need to define it?
Thank you!
Re: Need help with eventhandlers to restart services
$_SERVICESERVICE$ is a custom variable that this uses, it won't be listed on the macro list.
What are you looking to accomplish?
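For readers wondering where that custom variable comes from: in Nagios Core, a directive whose name starts with an underscore inside an object definition is a custom variable, and Nagios exposes it as a macro built from the object type plus the uppercased variable name. A `_SERVICE` directive inside a service definition therefore becomes `$_SERVICESERVICE$`. The sketch below prints an example definition and the resulting macro name. The host name `winserver`, the `Spooler` service, and the check_nt check are assumptions for illustration and do not come from the thread.

```shell
#!/bin/sh
# Print a hypothetical service definition that sets the _SERVICE custom
# variable, which Nagios exposes to commands as $_SERVICESERVICE$.
# $1 = Nagios host_name, $2 = Windows service name (both illustrative).
print_service_def() {
    cat <<EOF
define service{
    use                     generic-service
    host_name               $1
    service_description     $2 service
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l $2
    event_handler           service_restart
    _SERVICE                $2
}
EOF
}

# Build the macro name Nagios derives from a service custom variable:
# "_SERVICE" plus the variable name without its leading underscore, uppercased.
service_macro_name() {
    name=$(printf '%s' "${1#_}" | tr '[:lower:]' '[:upper:]')
    printf '$_SERVICE%s$\n' "$name"
}

print_service_def winserver Spooler
service_macro_name _service
```

The last line of output is `$_SERVICESERVICE$`, the macro that the service_restart command passes to servicerestart.sh as its third argument.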
Former Nagios Employee
Re: Need help with eventhandlers to restart services
I'm trying to understand the language, the coding, and the samples, but in the sample I can't see where the variable is defined. Can you tell me where?
I'm really new to Nagios, and the help is confusing.
Re: Need help with eventhandlers to restart services
But, thinking about it, what I really need is a script that detects when an HTTP page returns an HTTP 500 error and restarts the service. If the page still shows the HTTP 500 error afterwards, the sh script should send an email to the work team. I'm still thinking about how to do it.
Re: Need help with eventhandlers to restart services
General approach for doing what you want with event handlers and notification:
- Let's assume Nagios checks every five minutes to see if you get a 500 error
- The event handler fires on each SOFT problem check, on the first HARD state, and on recovery, so you teach it to exit without doing anything if the result is "OK"
- When there is a 500 error, the event handler knows that this is the 1st SOFT CRITICAL
- Meanwhile, Nagios starts checking every minute (let's assume) and will check for five total attempts before notifying
- Send a command to Nagios to stop checking the service (this is an easy way to prevent checks from running while your event handler is trying to restart something)
- Try to restart the server (may require SSH or other remote access if the server is remote)
- When done (or when it can't fix it) it sends Nagios the command to start doing service checks again
- Optionally, teach your script to try again on the second SOFT CRITICAL
HOWEVER, if the restart worked, the next Nagios check will go back to an OK state and no notifications will have been sent out. Optionally, you can send an email from within your script to let people know that it was restarted. If you try to coordinate all of this from within the script itself, it gets very complex.
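The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: the argument order, the RESTART_CMD placeholder, and the command file path (the usual default for a source install) are illustrative choices. DISABLE_SVC_CHECK and ENABLE_SVC_CHECK are standard Nagios external commands, but they only take effect if external commands are enabled and the nagios user can write to the command file. The sketch handles the first SOFT CRITICAL only and does not send email.

```shell
#!/bin/sh
# Sketch of the approach above. Assumed command_line arguments:
#   $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$

# Assumed default path of the external command file.
COMMAND_FILE="${COMMAND_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"
# Hypothetical restart command; replace with SSH, NRPE, or whatever fits.
RESTART_CMD="${RESTART_CMD:-echo restart not configured}"

# Write one external command line to the Nagios command file.
send_nagios_command() {
    printf '[%s] %s\n' "$(date +%s)" "$1" >> "$COMMAND_FILE"
}

handle_http_event() {
    state="$1"; state_type="$2"; attempt="$3"; host="$4"; svc="$5"

    # Exit quietly unless this is the first SOFT CRITICAL.
    [ "$state" = "CRITICAL" ] || return 0
    [ "$state_type" = "SOFT" ] || return 0
    [ "$attempt" -eq 1 ] || return 0

    # Pause active checks of this service while the restart runs.
    send_nagios_command "DISABLE_SVC_CHECK;$host;$svc"

    # Try the restart and remember whether it succeeded.
    if $RESTART_CMD; then
        result=0
    else
        result=1
    fi

    # Resume checks whether or not the restart worked.
    send_nagios_command "ENABLE_SVC_CHECK;$host;$svc"
    return "$result"
}

# When called by Nagios with all five arguments, act on them.
if [ "$#" -ge 5 ]; then
    handle_http_event "$@"
fi
```

Leaving notification to Nagios itself, as described above, keeps the script small: if the restart fails, the service reaches a HARD CRITICAL after the remaining retries and the normal notification goes out.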
Re: Need help with eventhandlers to restart services
Thanks eloyd!
andresfvs, let us know if you have any additional questions.
Thank you