A couple seemingly basic issues I can't seem to overcome...

jbwaclawski · Post by **jbwaclawski** » Mon Dec 10, 2012 3:14 pm

I'm working with the Nagios XI trial VM image and have encountered a few obnoxious issues.

1) I am continuously receiving the following error in my NSClient++ log file (version 0.3.9.330 2011-09-02 x64):

error:include\Socket.h:713: Error: Could not complete SSL handshake : [-1] 1, attempting to resume...

I don't know what to try with this aside from enabling/disabling SSL. Any thoughts would be very helpful.

2) Event handlers just do not work, or I'm missing a very small, very crucial step. I've followed the extremely basic configuration found here: http://assets.nagios.com/downloads/nagi ... h_NRPE.pdf . My .bat file works on Windows, and my shell script works from Linux when I run ./servicerestart.sh "CRITICAL" 192.168.1.103 Spooler, but the service monitor just won't trigger the damn thing in any state at all. I don't know whether it has something to do with my SSL problem or not.

Setup:
>> Nagios XI VM image, running CentOS 6.3, hosted in VirtualBox
>> Time on both NagiosXI and client sync'd to same public NTP pool
>> Client is Windows 7 x64, missing a few updates
>> Using NRPE within NSClient++ v. 0.3.9.330 2011-09-02 x64 (./check_nrpe -H 192.168.1.103 works fine)

There's a start, thanks for the help and let me know if you have any questions.

jbwaclawski · Post by **jbwaclawski** » Mon Dec 10, 2012 3:55 pm

Adding on to that:

3) The monitor I created to poll the Print Spooler won't Sync (Core Configuration Manager >> Services >> Service Status). I don't know why. I've tried deleting and recreating the monitor, restarting Nagios, and restarting the server completely. Nothing is working. I'm not sure if this links to either of the other problems, but it's driving me nuts.

slansing · Post by **slansing** » Mon Dec 10, 2012 4:11 pm

1) Check your firewall and make sure that TCP port 5666 is open on both your Nagios server and the destination IP.
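A quick way to test reachability from the Nagios server is sketched below. It assumes `bash` and the coreutils `timeout` command are installed; the function name `port_open` is made up for illustration and is not part of Nagios or NRPE.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT succeeds.
# Assumes bash (for /dev/tcp) and coreutils timeout are available.
port_open() {
    host=$1
    port=$2
    # Reject a missing host or a non-numeric port before touching the network.
    case "$port" in
        ''|*[!0-9]*) echo "usage: port_open HOST PORT" >&2; return 2 ;;
    esac
    [ -n "$host" ] || { echo "usage: port_open HOST PORT" >&2; return 2; }
    # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `port_open 192.168.1.103 5666 && echo open || echo closed` run on the Nagios server would show whether the NRPE port on the client answers at all. A successful connection only proves the port is reachable; it says nothing about the SSL handshake.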

2) If you are unable to get any checks to go through to your Windows server, this will not work. But you did mention that the following works:

./check_nrpe -H xxx.xxx.xx.x -p 5666 -c runcmd -a spooler

Correct?

jbwaclawski · Post by **jbwaclawski** » Mon Dec 10, 2012 10:56 pm

Yeah, here's my test setup:

I have the Nagios XI trial VM installed in VirtualBox running on my desktop. I have all firewalls and AV applications on my host (desktop) turned off and/or disabled so as to limit possible blockages.

I can do the basic check to my desktop from my VM with no issue...

./check_nrpe -H host_ip

...I can even activate the remote script without issue as well as pass arguments to it...

./check_nrpe -H host_ip -c runcmd

./check_nrpe -H host_ip -c runcmd -a Spooler

I then created a script within the VM to use as the trigger of my event handler (code below). This script even works when I pass arguments to it...

./servicerestart.sh "CRITICAL" 192.168.1.103 Spooler

After all of the manual testing was done I figured I was ready to test the application's ability to trigger an event handler, so I created a service monitor. Once I had it monitoring Spooler and received data showing that it was up/down properly, I moved back to CCM to add the event handler information. I made a command that read as such:

$USER1$/servicerestart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICE$

...set it as a misc command, made it Active, and saved the operation. Once back on the commands screen, I applied the configuration to be safe. I then moved to Services and began editing the monitor I had previously created for the Print Spooler: I selected my newly created Service Restart event handler as its event handler, turned event handlers on, and created a variable definition of _SERVICE = Spooler. I saved all of that, applied the config, and moved on to testing.

At this point I manually stopped the Spooler service on my desktop and let Nagios find out on its own. It waited, it checked, and it detected that the service was down, but no event handler launched. I let it go through tries 1-5 and still nothing. I checked var/nagios.log and found nothing strange, other than the fact that there is no event handler information in there (I'm not sure whether there is supposed to be). I checked nsclient.log and found all of the SSL errors. Those may have something to do with it, but I don't know for sure. There's not a lot of documentation out there regarding anything in Nagios or NSClient++. I tried going into nsc.ini and disabling SSL on NRPE (though it was already commented out) and trying again. That time it gave me the following error:

message:modules\NRPEListener\NRPEListener.cpp:370: Could not read a full NRPE packet from socket, only got: 77

I also frequently see this error, with or without SSL on:

error:CACHEmodules\NRPEListener\NRPEListener.cpp:70: No scripts found in path: scripts\*.*

...which is strange, because my runcmd.bat file is in the NSClient++\scripts\ directory.

I finally noticed the "Failed to Sync" error on the services panel in the Core Configuration Manager after going back and checking all of my work. I have no idea what the Sync Status column is for in there, but my guess is that it MAY have something to do with all of this. I also have the feeling that once I fix one problem, the rest are going to start working as well. Sorry for the outrageous wall of text, but with something like this I figured I'd be as detailed as humanly possible so if there's any human error it can be pointed out.
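On the question of whether the log should mention event handlers: Nagios Core writes lines containing "SERVICE EVENT HANDLER:" (or "HOST EVENT HANDLER:") when event handler logging is enabled, so their absence is a useful clue. Here is a minimal sketch for pulling them out. The helper name `show_event_handlers` is hypothetical, and the default log path is an assumption based on the /usr/local/nagios prefix used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the event handler entries found in a Nagios log.
# Returns 2 if the log cannot be read, 1 if no entries were found, 0 otherwise.
show_event_handlers() {
    logfile=${1:-/usr/local/nagios/var/nagios.log}   # assumed default path
    [ -r "$logfile" ] || { echo "cannot read $logfile" >&2; return 2; }
    # Match both service and host event handler log entries.
    grep -E '(SERVICE|HOST) EVENT HANDLER:' "$logfile"
}
```

If this prints nothing after a service has gone CRITICAL, the handler was most likely never invoked, which points at the configuration rather than at the script or the NRPE connection.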

servicerestart.sh:

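#!/bin/sh
# Event handler for restarting Windows services
# $1 = service state, $2 = host address, $3 = Windows service name

case "$1" in
        OK)
                # Service is fine; nothing to do
                ;;
        WARNING)
                ;;
        UNKNOWN)
                ;;
        CRITICAL)
                # Ask NSClient++ (via NRPE) to run the runcmd script for this service
                /usr/local/nagios/libexec/check_nrpe -H "$2" -t 120 -c runcmd -a "$3"
                ;;
esac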
exit 0
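When a handler like this appears never to fire, it can help to record every invocation together with the arguments Nagios actually passed. The following is only a debugging sketch: `log_handler_call` and the `HANDLER_LOG` path are made-up names, and the log location must be writable by whatever user Nagios runs as.

```shell
#!/bin/sh
# Hypothetical debugging aid: record each invocation so you can tell whether
# Nagios ran the handler at all. HANDLER_LOG is an assumed path.
HANDLER_LOG=${HANDLER_LOG:-/tmp/servicerestart.log}

log_handler_call() {
    # $1 = service state, $2 = host address, $3 = Windows service name
    printf '%s state=%s host=%s service=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" >> "$HANDLER_LOG"
}
```

Calling `log_handler_call "$@"` at the top of servicerestart.sh would show, per run, what each argument expanded to. No new lines in the log means the script was not run; an empty or unexpected `service=` field would suggest the third macro is not expanding as intended.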

scottwilkerson · Post by **scottwilkerson** » Tue Dec 11, 2012 11:50 am

jbwaclawski wrote: After all of the manual testing was done I figured I was ready to test the application's ability to trigger an event handler, so I created a service monitor. Once I had it monitoring Spooler and received data showing that it was up/down properly, I moved back to CCM to add the event handler information. I made a command that read as such:

$USER1$/servicerestart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICE$

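This was close, but the macro for a service custom variable is the variable name (without its leading underscore) prefixed with $_SERVICE, so for a variable named _SERVICE it should be: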

$USER1$/servicerestart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$
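The doubled SERVICE looks odd, so here is a small sketch of how the macro name is put together. Nagios drops the leading underscore from the custom variable name, upper-cases it, and prefixes it with `_HOST`, `_SERVICE` or `_CONTACT` depending on the object type. The helper name `custom_macro` is hypothetical and exists only to illustrate the rule.

```shell
#!/bin/sh
# Hypothetical helper: build the macro name for a custom object variable.
# Usage: custom_macro OBJECT_TYPE VARIABLE_NAME
custom_macro() {
    # Normalise both parts to upper case; strip one leading underscore
    # from the variable name, as Nagios does.
    objtype=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    varname=$(printf '%s' "${2#_}" | tr '[:lower:]' '[:upper:]')
    case "$objtype" in
        HOST|SERVICE|CONTACT) printf '$_%s%s$\n' "$objtype" "$varname" ;;
        *) echo "unknown object type: $1" >&2; return 1 ;;
    esac
}

custom_macro service _SERVICE   # prints $_SERVICESERVICE$
```

By the same rule, a service variable named _WINSERVICE would be referenced as `$_SERVICEWINSERVICE$`, which may be easier to read than the doubled form.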

Nagios Support Forum
