NRPE: Automatic restart of multiple services

Posted: Wed Jul 08, 2015 3:08 pm
by mhixson2
I have followed this guide to successfully restart a Windows service if it stops using an event handler and NRPE check_service.

I have also successfully defined a check_service that monitors several Windows services in one service definition by repeating the service=[service name] argument (i.e. service=[service1] service=[service2] service=[service3], etc.). This has worked well in my testing.

I'm trying to get the two working together on the same service, but I can't quite get it. I think the hang-up is in the service setup > Misc Settings tab > Manage Variable Definitions. The guide has you define the one service you want the event handler to restart if it stops or reaches a critical state. Is there any way to define multiple services there? Does the script need to be adjusted? Or is there a different way to achieve this?

Thanks

EDIT = clarified this is for a Windows host using NRPE

Re: NRPE: Automatic restart of multiple services

Posted: Wed Jul 08, 2015 3:38 pm
by ssax
You should be able to use the attached script I whipped up for you and set _SERVICE to a comma-separated list of services like:

process1,process2,process3

Attachment: multi_service_restart.zip

Re: NRPE: Automatic restart of multiple services

Posted: Wed Jul 08, 2015 4:01 pm
by mhixson2
Thanks!

Works awesome but doesn't seem to handle service names with spaces. Do you know how to escape those spaces?

Re: NRPE: Automatic restart of multiple services

Posted: Wed Jul 08, 2015 4:40 pm
by abrist
Try wrapping all instances of %%F with quotes:

@echo off
SET SERVICES=%1

:: Loop through services and restart them
:LOOP
FOR /F "tokens=1,* delims=," "%%F" IN (%SERVICES%) DO (
    net stop "%%F"
    net start "%%F"
    SET SERVICES="%%G"
    GOTO LOOP
)

@exit 0

Don't actually.

Re: NRPE: Automatic restart of multiple services

Posted: Thu Jul 09, 2015 10:55 am
by mhixson2
Hmm... looks like adding the quotes is causing an error.

I have dialed everything back to one service with no spaces in its name to confirm everything is operational. On the Nagios server, I run this command from the libexec directory:

./check_nrpe -H [host] -p 5666 -c restart_service -a Spooler

It returns this error:

The command (restart_service) returned an invalid return code: 255|

If I revert the batch script to have no quotes, it works fine.

BTW... this is awesome
VI VI VI - The editor of the Beast!

Re: NRPE: Automatic restart of multiple services

Posted: Thu Jul 09, 2015 11:10 am
by abrist
My bad, let me pull in [someone] who is a Batch A** mutha!

Re: NRPE: Automatic restart of multiple services

Posted: Thu Jul 09, 2015 11:14 am
by tmcdonald
Personally, I would do this in PowerShell instead so you can leverage things like echoargs:

http://stackoverflow.com/questions/1673 ... and-quotes

Re: NRPE: Automatic restart of multiple services

Posted: Thu Jul 09, 2015 11:26 am
by ssax
abrist's code was right; I validated it against mine. Here is the updated version:
multi_service_restart.zip
Please try the attached file and post the full output of any errors you get so that we can troubleshoot further.

Re: NRPE: Automatic restart of multiple services

Posted: Fri Jul 10, 2015 8:49 am
by mhixson2
Awesome, thanks guys.

The last script posted is doing the trick. I had to enclose the spacey service name in single quotes in the service definition (service='spacey service name') and double quotes in the variable definition ("spacey service name",otherservice,otherservice,etc.).
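
To illustrate the variable format above (comma-separated, with double quotes around a name that contains spaces), here is a minimal sketch in POSIX shell of how such a list breaks down into individual names. It is not the attached batch script; `split_services` is a hypothetical helper, and it assumes no service name contains a comma.

```shell
#!/bin/sh
# Hypothetical helper (not part of the attached zip): split a comma-separated
# service list into one name per line, stripping one pair of surrounding
# double quotes from each entry.
# ASSUMPTION: service names never contain a comma themselves.
split_services() (
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r name; do
        name=${name#\"}    # drop a leading double quote, if any
        name=${name%\"}    # drop a trailing double quote, if any
        if [ -n "$name" ]; then
            printf '%s\n' "$name"
        fi
    done
)

# Example call, using the format from the variable definition:
split_services '"spacey service name",otherservice'
```

Each printed line is one service name with its spaces intact, which is what the restart step ultimately needs to receive as a single argument.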

Looking at the batch script and watching the behavior on the server when a service is auto-restarted, I see that it actually restarts every service in the list when one fails. Unfortunately, this isn't going to work for us. Is it possible to set up logic in the batch script to restart only the affected service? If not, can I get some help setting that up in PowerShell?

Thanks!

Re: NRPE: Automatic restart of multiple services

Posted: Fri Jul 10, 2015 9:11 am
by ssax
We should be able to tweak the command and/or the .sh script to parse the output to determine which one needs to be restarted.

Please post the full check (and the output) that you are running initially to check the services so I can see if it has what we need.
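
In the meantime, here is a rough sketch of the parsing idea in POSIX shell. It assumes the check output lists each problem service as `name=stopped (start_type)`, separated by commas, after a `STATUS:` prefix; that format is an assumption until the real output is posted, and `stopped_services` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every service that the check output
# reports as stopped, one per line.
# ASSUMPTION: output looks like
#   CRITICAL: Spooler=stopped (auto), My Service=stopped (auto)|perfdata
# The real format depends on the check's syntax settings.
stopped_services() (
    output=${1%%|*}          # drop perfdata after the first "|"
    output=${output#*: }     # drop the leading "STATUS: " prefix
    printf '%s\n' "$output" | tr ',' '\n' | while IFS= read -r entry; do
        # trim leading whitespace left over from ", " separators
        entry=${entry#"${entry%%[![:space:]]*}"}
        case $entry in
            *=stopped*) printf '%s\n' "${entry%%=stopped*}" ;;
        esac
    done
)

# Example call with made-up output:
stopped_services 'CRITICAL: Spooler=stopped (auto), My Service=stopped (auto)'
```

Each name printed could then be passed on its own to the restart command (the same check_nrpe ... -c restart_service -a call used earlier in the thread), so only the stopped services get restarted.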