
Re: Using NRPE for windows server

PostPosted: Mon Feb 12, 2018 3:35 pm
by npolovenko
@skypete, Good. So aside from not being able to automatically restart a service on the Windows server, does the service check work as expected in the web interface? By that I mean, does it correctly turn green when the Windows service is up and red when the Windows service is down? If the answer is yes, give your event handler /usr/local/nagios/libexec/restart_service.sh execute permission:
Code: Select all
chmod +x /usr/local/nagios/libexec/restart_service.sh
Then open the script and make it do something basic, like create a text file:
Code: Select all
#!/bin/sh
echo "Event handler works!" > testing.txt
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3"
;;
esac
exit 0


So when Nagios calls the event handler, this line of code:
Code: Select all
echo "Event handler works!" > testing.txt

will create a text file in the /usr/local/nagios/libexec/ folder.
Let me know if the event handler creates this file for you.
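
Because testing.txt is a relative path, the file lands in whatever directory the handler happens to be started from. A minimal sketch of a more traceable debug line follows. The log location /tmp/restart_service_debug.log and the helper name log_handler_call are assumptions for illustration and are not part of the setup above:

```shell
#!/bin/sh
# Hypothetical debug log location; an absolute path does not depend on the
# directory the event handler is started from.
LOG_FILE="${LOG_FILE:-/tmp/restart_service_debug.log}"

# Append one timestamped line recording the arguments the handler received.
log_handler_call() {
    printf '%s state=%s host=%s service=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" >> "$LOG_FILE"
}
```

Calling log_handler_call "$1" "$2" "$3" at the top of restart_service.sh would then record every invocation together with the state, host and service name it was given.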

Also, keep in mind that when we defined the service, we added these options:
Code: Select all
    max_check_attempts      5
    check_interval          5


This means that once the service is down, Nagios will check it 5 more times at 5-minute intervals before calling the event handler. So either lower these values and restart the nagios service, or keep the service down for 30 minutes before checking the result.
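
As a rough sanity check on how long to wait, here is a sketch of the arithmetic. It assumes the first failure is noticed within one check_interval and that the remaining max_check_attempts - 1 rechecks are spaced by a recheck interval (taken as the same 5 minutes here); the function name is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: upper bound, in minutes, until all check attempts are used up.
# $1 = max_check_attempts, $2 = check_interval, $3 = interval between rechecks
minutes_until_last_attempt() {
    echo $(( $2 + ($1 - 1) * $3 ))
}

minutes_until_last_attempt 5 5 5
```

With the values above this prints 25, so a 30-minute wait leaves some margin. If the rechecks are spaced more tightly, pass that value as the third argument to get a shorter estimate.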
Please send me a screenshot from the Nagios Core web interface of this service in a critical state.

Re: Using NRPE for windows server

PostPosted: Mon Feb 12, 2018 4:44 pm
by skypete
I added all the commands, and yes, the service check works as expected in the web interface. It seems to create testing.txt, but the service is not restarted. Would you like me to send my cfg file so you can take a look at it?




Re: Using NRPE for windows server

PostPosted: Mon Feb 12, 2018 5:54 pm
by npolovenko
@skypete, I made a typo earlier and I'm not sure whether you copied it over, so let's make sure to fix it. In commands.cfg change:
Code: Select all
define command {
    command_name     restart-service
    command_line    $USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$T$
}

To:
Code: Select all
define command {
    command_name     restart-service
    command_line    $USER1$/restart_service.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$
}
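
To see how the three macros on command_line reach the script, here is a dry-run sketch: Nagios substitutes the macros, and the handler receives the results as $1, $2 and $3. The function name build_restart_command and the sample address 192.0.2.10 are placeholders, and the function only prints the command instead of running it:

```shell
#!/bin/sh
# Dry run: print the check_nrpe call the handler would make.
# $1 = $SERVICESTATE$, $2 = $HOSTADDRESS$, $3 = $_SERVICESERVICE$
build_restart_command() {
    case "$1" in
        CRITICAL)
            echo "/usr/local/nagios/libexec/check_nrpe -H \"$2\" -p 5666 -c restart_service -a \"$3\""
            ;;
        *)
            # OK, WARNING and UNKNOWN: nothing to do.
            ;;
    esac
}

build_restart_command CRITICAL 192.0.2.10 spoolsv.exe
```

If the third argument comes out empty or still contains a literal macro name, the custom variable is not being picked up from the service definition.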


Next, add the _SERVICE line to your service definition:
Code: Select all
define service {
    host_name               HostName
    service_description     Print Spooler
    check_command           check_nrpe!check_process!process=spoolsv.exe!show-all
    max_check_attempts      5
    event_handler           restart-service
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    contacts                nagiosadmin
    _SERVICE                "spoolsv.exe"
}


Restart nagios:
Code: Select all
service nagios restart
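
Before restarting, it can be worth validating the configuration so that a typo in commands.cfg does not keep Nagios from coming back up. A sketch, assuming the default source-install paths /usr/local/nagios/bin/nagios and /usr/local/nagios/etc/nagios.cfg (adjust if yours differ). The wrapper function is hypothetical; nagios -v is the standard configuration check:

```shell
#!/bin/sh
# Assumed default paths for a source install; override via the environment.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Restart Nagios only if the configuration passes verification.
verify_then_restart() {
    if [ ! -x "$NAGIOS_BIN" ]; then
        echo "nagios binary not found at $NAGIOS_BIN" >&2
        return 1
    fi
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        service nagios restart
    else
        echo "configuration errors found; not restarting" >&2
        return 1
    fi
}
```

Run verify_then_restart as root; if the verification step reports errors, fix them first and run it again.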


Let me know if that got it to work!

Re: Using NRPE for windows server

PostPosted: Wed Feb 14, 2018 3:06 pm
by skypete


Thanks for your help. I did add those commands. I checked the Windows server and the service was not restarted; it still shows as stopped. As for your question about whether the check turns green when the Windows service is up and red when it is down: the Print Spooler service stays green in the Nagios UI, but the Service or Daemon Checks and Process Checks services change to red. Here are my service definitions and screenshots of the Windows server and the Nagios web UI. Thanks.


define service{
host_name hostname
service_description Service or Daemon Checks
check_command check_nrpe!check_service -a service=spooler
max_check_attempts 5
check_interval 5
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts nagiosadmin
}


define service {
host_name hostname
service_description Print Spooler
check_command check_nrpe!check_process!process=spoolsv.exe!show-all
max_check_attempts 5
event_handler restart-service
check_interval 5
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts nagiosadmin
_SERVICE "spoolsv.exe"
}

define service {
use generic-service
host_name hostname
service_description Process Checks
check_command check_nrpe!check_process -a process=spoolsv.exe show-all
max_check_attempts 5
event_handler restart-service
check_interval 5
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts nagiosadmin
_SERVICE "spoolsv.exe"
}

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 3:08 pm
by skypete
Any luck? Also, here are my logs for that service. Hope this helps.

;Print Spooler;CRITICAL;SOFT;1;CHECK_NRPE STATE CRITICAL: Socket timeout after 30 seconds.
[1518722060] SERVICE EVENT HANDLER: SERVERHOST1;Print Spooler;CRITICAL;SOFT;1;restart-service
[1518722060] wproc: SERVICE EVENTHANDLER job 4 from worker Core Worker 3667 is a non-check helper but exited with return code 2
[1518722060] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1518722060] wproc: stderr line 01: execvp(/usr/local/nagios/libexec/restart_service.sh, ...) failed. errno is 2: No such file or directory

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 3:51 pm
by npolovenko
@skypete, Well, your check in Core has a 5-minute interval. Did you wait 5 minutes for it to turn red? Also, some Windows services have an automatic recovery option enabled, which means that if the process goes down, Windows restarts it automatically. Maybe that's why the service in the UI was green? Has the check ever worked, that is, has it ever reflected the actual state of the Windows process?

[1518722060] wproc: stderr line 01: execvp(/usr/local/nagios/libexec/restart_service.sh, ...) failed. errno is 2: No such file or directory

Hmm, why would you get this error? Can you cd into that directory and run chmod a+x restart_service.sh?
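
For what it's worth, errno 2 on an exec usually points at one of two things: the file is not at that exact path, or the interpreter named on its first line cannot be found (for example, when the script was saved with Windows line endings, so the kernel looks for "/bin/sh" followed by a carriage return). A sketch that checks the common causes in one go; the helper name diagnose_handler is made up:

```shell
#!/bin/sh
# Hypothetical helper: report the usual reasons a script cannot be executed.
diagnose_handler() {
    script=$1
    if [ ! -e "$script" ]; then
        echo "missing: $script does not exist"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: run chmod a+x $script"
        return 1
    fi
    # A carriage return on the shebang line makes the kernel search for an
    # interpreter whose name ends in CR, which also yields errno 2.
    if head -n 1 "$script" | grep -q "$(printf '\r')"; then
        echo "CRLF line endings: convert $script to Unix line endings"
        return 1
    fi
    echo "ok: $script exists, is executable, and has a Unix shebang line"
}
```

Usage: diagnose_handler /usr/local/nagios/libexec/restart_service.sh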

Also, since you said this worked (testing.txt got created):
Code: Select all
#!/bin/sh
echo "Event handler works!" > testing.txt
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3"
;;
esac

Let's take it one step further and do this:
Code: Select all
#!/bin/sh
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c restart_service -a "$3"
echo "/usr/local/nagios/libexec/check_nrpe -H \"$2\" -p 5666 -c restart_service -a \"$3\"" > testing.txt

;;
esac


Please make the service go into the critical state again and then upload the testing.txt file so that we can see what's going on.

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 3:57 pm
by skypete
Hi, and thanks again. The check has never worked: it has never been in a critical state, it has never reflected the actual state of the Windows process, and it never goes red when I stop the service on the Windows server. It only works when I run /usr/local/nagios/libexec/check_nrpe -H (IP address of server) -c restart_service -a spooler manually; then it goes back to its current state.

As for testing.txt, all it contains at the moment is Event handler works!
Another thing: when I run chown apache:nagios /usr/local/nagios/libexec/restart_service.sh, I get this error: chown: invalid user: apache:nagios

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 4:18 pm
by npolovenko
On the Windows server, open the command line as administrator and run:
Code: Select all
net stop Spooler

Then from Nagios server run and show me the output of:
Code: Select all
./check_nrpe -H windows_server_ip -c check_process -a process=spoolsv.exe show-all

Then on windows server:
Code: Select all
net start Spooler

Then from Nagios server run and show me the output of:
Code: Select all
./check_nrpe -H windows_server_ip -c check_process -a process=spoolsv.exe show-all


I tested and it all works fine for me.
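
When running these checks by hand, the plugin's exit code matters as much as the text it prints, because the exit code is what decides the state shown in the UI. A small sketch that turns the exit code into a state name, following the standard plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function name is made up, and any other value is reported as UNKNOWN here:

```shell
#!/bin/sh
# Map a plugin exit code to the corresponding Nagios state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}
```

Usage: run ./check_nrpe -H windows_server_ip -c check_process -a process=spoolsv.exe show-all and then state_name $? on the next line. With the Spooler stopped, the expectation is CRITICAL; if it still reports OK, the check itself is not seeing the stopped process.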

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 4:22 pm
by npolovenko
skypete wrote:It only works when I run /usr/local/nagios/libexec/check_nrpe -H (IP address of server) -c restart_service -a spooler manually; then it goes back to its current state.

You're saying that running this command manually changes what you see in the UI? That is impossible.
skypete wrote:when I run chown apache:nagios /usr/local/nagios/libexec/restart_service.sh, I get this error: chown: invalid user: apache:nagios

You can try chown apache.nagios /usr/local/nagios/libexec/restart_service.sh
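
The invalid user message from chown generally means the user part of the owner spec does not exist on the system. On Debian and Ubuntu the Apache account is normally www-data rather than apache, so it may help to check which account is actually present before running chown. A sketch; the helper name first_existing_user is made up:

```shell
#!/bin/sh
# Print the first of the given account names that exists on this system.
first_existing_user() {
    for name in "$@"; do
        if id -u "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

Usage: first_existing_user apache www-data prints whichever of the two accounts exists, and returns a non-zero status if neither does.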

Re: Using NRPE for windows server

PostPosted: Thu Feb 15, 2018 4:26 pm
by skypete
root@ubuntu:/usr/local/nagios/libexec# ./check_nrpe -H IP ADDRESS -c check_process -a process=spoolsv.exe show-all
OK: spoolsv.exe=started|'spoolsv.exe state'=1;0;0 'count'=1;0;0