Help with Nagios and NRPE Event Handlers

Posted: Wed Apr 16, 2014 12:48 pm
by sinaowolabi
Hi!

I monitor several services on different systems, and some of them occasionally just need to be restarted (for instance, strongSwan VPNs). My real problem of the day is a FreeIPA/Red Hat IdM system whose dirsrv instance times out for no apparent reason; everything stops working until I kill ns-slapd and run "ipactl restart".

I wrote a small bash script that does exactly that (killall -9 ns-slapd, then ipactl stop && ipactl start). Since the IPA server is already monitored by Nagios with ldap_check and dns_check, I want the script to run whenever a check times out. I use NRPE on the remote host (the IPA server) for monitoring.
The IPA server and the Nagios server both run 64-bit RHEL6.5 with 2GB RAM. They are guests in different KVM hypervisors.

To this end I dropped the executable bash script into the /usr/lib64/nagios/plugins/eventhandlers/ directory on the IPA server, and created a line for the command in the /etc/nagios/nrpe.cfg file like so:

command[ipactl_restart]=/usr/lib64/nagios/plugins/eventhandlers/ipactl_restart.sh

Then, on the Nagios server, I created a command for this and also copied the bash script into the server's own event handler directory, like so:

##ipactl restart command
define command{
command_name ipactl_restart
command_line $USER1$/eventhandlers/ipactl_restart.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
}


I then added an event_handler line to the service definition in the appropriate file in Nagios:

define service {
use generic-service
hostgroup_name IPAServers
service_description IPA Directory Service
check_command check_nrpe!check_ipa389
max_check_attempts 3
check_interval 5
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts admins
event_handler check_nrpe!ipactl_restart
}


I restarted Nagios, but nothing happens: it logs the outages, but no event handler runs!

[1397636010] SERVICE NOTIFICATION: nagiosadmin;services.example.com;IPA Directory Service;CRITICAL;notify-service-by-email;CHECK_NRPE: Socket timeout after 20 seconds.

Can someone please point out what I am doing wrong? I also tried a command_line that calls check_nrpe, but that didn't work either:

#command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c /usr/lib64/nagios/plugins/eventhandlers/ipactl_restart.sh -a $ARG1$

Any assistance welcome, and thanks in advance!

Re: Help with Nagios and NRPE Event Handlers

Posted: Wed Apr 16, 2014 3:42 pm
by tmcdonald
Did you restart the NRPE server (or xinetd if it is run through that) after adding the command?

Re: Help with Nagios and NRPE Event Handlers

Posted: Wed Apr 16, 2014 5:47 pm
by sinaowolabi
Hi

Yes I did.

Re: Help with Nagios and NRPE Event Handlers

Posted: Thu Apr 17, 2014 9:52 am
by slansing
Where are you using:

##ipactl restart command
define command{
command_name ipactl_restart
command_line $USER1$/eventhandlers/ipactl_restart.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
}
It looks like your event_handler line calls the NRPE-defined command ipactl_restart through check_nrpe, not the event handler command you defined above.

Re: Help with Nagios and NRPE Event Handlers

Posted: Thu Apr 17, 2014 1:16 pm
by sinaowolabi
Please, what is the proper way to do it?

Re: Help with Nagios and NRPE Event Handlers

Posted: Thu Apr 17, 2014 1:50 pm
by abrist
Is the script run from the Nagios server or from the remote host? If it needs to be run on the remote host, you need to move the script to that host, create an NRPE command for it, and then alter your Nagios server check and command to run it through NRPE.

Re: Help with Nagios and NRPE Event Handlers

Posted: Tue Apr 22, 2014 11:51 am
by sinaowolabi
Thanks. I kept the script on both the Nagios server and the NRPE client host, and it's being run on the client. It works now, but I want to make sure I am doing the right thing. I would love to be able to replicate this for other events I need to manage (e.g. restarting a strongSwan VPN).
If you could kindly review and correct me, I'd be very grateful.

On the server (commands.cfg):
##ipactl restart command
define command{
command_name ipactl_restart
#command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c /usr/lib64/nagios/plugins/eventhandlers/ipactl_restart.sh -a $ARG1$
command_line $USER1$/eventhandlers/ipactl_restart.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
}

(host.cfg):
define service {
use generic-service
hostgroup_name IPAServers
service_description IPA Directory Service
check_command check_nrpe!check_ipa389
max_check_attempts 3
check_interval 5
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts sinaadmin
event_handler_enabled 1
event_handler check_nrpe!ipactl_restart
}

On the NRPE client:
command[ipactl_restart]=/usr/lib64/nagios/plugins/eventhandlers/ipactl_restart.sh

Re: Help with Nagios and NRPE Event Handlers

Posted: Tue Apr 22, 2014 2:53 pm
by abrist
Does your script log into the remote host or use NRPE? I ask because the usual way to configure this is to call check_nrpe in the Nagios command and pass it the NRPE command name and any arguments. check_nrpe then connects to the remote host, which executes the remote command. Currently it looks like you are just running the script on the Nagios server. I would configure this in the following way:
On the NRPE client:

command[ipactl_restart]=/usr/lib64/nagios/plugins/eventhandlers/ipactl_restart.sh $ARG1$
And then the nagios configuration:

##ipactl restart command
define command{
    command_name ipactl_restart
    command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c ipactl_restart -a '$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$'
}
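
For reference, here is a rough sketch of how the script on the NRPE client could use the four values passed in through $ARG1$. This is not the original poster's script, only an illustration: the function names should_restart and restart_ipa are made up, and the choice to act only on a HARD CRITICAL state is an assumption you may want to change. Also note that NRPE only accepts -a arguments if it was built with command-argument support and dont_blame_nrpe=1 is set in nrpe.cfg, and the NRPE user needs permission to run the restart commands (for example through sudo, not shown here).

```shell
#!/bin/bash
# Hypothetical sketch of ipactl_restart.sh on the NRPE client.
# Arguments arrive via $ARG1$ in nrpe.cfg, in this order:
#   $1 = $SERVICESTATE$      (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$
#   $4 = $HOSTADDRESS$

# Return 0 only when the service is in a confirmed (HARD) CRITICAL state.
# Assumption: SOFT states are ignored so a single timeout does not
# trigger a restart.
should_restart() {
    local state="$1" state_type="$2"
    [ "$state" = "CRITICAL" ] && [ "$state_type" = "HARD" ]
}

# The restart steps described in the first post. Assumes the NRPE user
# is allowed to run them; privilege handling is not shown.
restart_ipa() {
    killall -9 ns-slapd
    ipactl stop && ipactl start
}

# Do nothing unless at least the state and state type were passed in.
if [ "$#" -ge 2 ] && should_restart "$1" "$2"; then
    echo "IPA Directory Service is $1/$2 (attempt ${3:-?}); restarting"
    restart_ipa
else
    echo "No action taken"
fi
```

Keeping the decision in its own function makes it easy to reuse the same skeleton for other handlers (such as a strongSwan restart) by swapping out only the restart function.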

Re: Help with Nagios and NRPE Event Handlers

Posted: Wed Apr 23, 2014 1:33 pm
by sinaowolabi
Thanks! I'll try this and report. I think I tried this way before and for some reason it didn't work.

Re: Help with Nagios and NRPE Event Handlers

Posted: Wed Apr 23, 2014 1:37 pm
by sinaowolabi
The script uses NRPE to run; it does not log into the remote machine. But if it had to log in remotely, how would this work?