Event Handler problem
Posted: Thu Mar 28, 2019 2:42 pm
by gixxx11
I'm trying to configure the event handler to trigger on an HTTP service check with a result of warning or critical.
I've set "event_handler_DisableLoadBalancer" as the event handler for the service and switched event handling "on".
The command line of event_handler_DisableLoadBalancer is "$USER1$/event_handler_DisableLoadBalancer.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$"
The contents of event_handler_DisableLoadBalancer.sh are:
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer on testoe2016-1.iss.inter-state.com
# What state is the HTTP service in?
case "$1" in
WARNING)
/usr/local/nagios/libexec/check_nrpe -H testoe2016-1.iss.inter-state.com -p 5666 -c DisableLoadBalancer -a spooler
exit 0
The command inside event_handler_DisableLoadBalancer.sh "/usr/local/nagios/libexec/check_nrpe -H testoe2016-1.iss.inter-state.com -p 5666 -c DisableLoadBalancer -a spooler" works exactly as I want it to when pasted into the terminal on my Nagios.
As you can see, the code in event_handler_DisableLoadBalancer.sh references the specific machine (testoe2016-1.iss.inter-state.com) I want to trigger. I would prefer it to be more generic and reference the hostname of the service, so that I can use this script on any of my many hosts. But since I can't get this simple version to work, I'm starting here.
Thanks for the assistance.
Re: Event Handler problem
Posted: Thu Mar 28, 2019 2:51 pm
by scottwilkerson
You have some syntax errors in your shell script.
Try this
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer on testoe2016-1.iss.inter-state.com
# What state is the HTTP service in?
case "$1" in
WARNING)
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H testoe2016-1.iss.inter-state.com -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
exit 0
If you change your command to
Code: Select all
$USER1$/event_handler_DisableLoadBalancer.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ "$HOSTNAME$"
you could do something like this
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer
# What state is the HTTP service in?
host=$4
case "$1" in
WARNING)
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H $host -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
exit 0
Re: Event Handler problem
Posted: Thu Mar 28, 2019 3:27 pm
by gixxx11
OK, so I used your suggestion and set the command to:
Code: Select all
$USER1$/event_handler_DisableLoadBalancer.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ "$HOSTNAME$"
And set the shell script to:
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer
# What state is the HTTP service in?
host=$4
case "$1" in
WARNING)
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H $host -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
exit 0
And still nothing. I'm stopping the website manually and then forcing a check in nagios. The service goes from UP to WARNING and then nothing.
I took screenshots of each just in case it's useful:
https://www.dropbox.com/sh/ilibjcrslbk1 ... ByHza?dl=0
Re: Event Handler problem
Posted: Thu Mar 28, 2019 3:57 pm
by scottwilkerson
It's hard to tell from the screenshot, but is there a " after $HOSTNAME$?
Also, can you run the test again while the following command is running, to see whether the event handler is being triggered and whether there are any errors?
Code: Select all
tail -f /usr/local/nagios/var/nagios.log
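The log can be noisy, so here is a minimal sketch of a filter that keeps only the lines that matter for this problem. The function name `filter_handler_lines` is made up for illustration, and the `EVENT HANDLER` pattern assumes that event handler logging is enabled in nagios.cfg (`log_event_handlers=1`).

```shell
#!/bin/sh
# Print only the event-handler-related lines from a Nagios log file.
# Usage: filter_handler_lines /path/to/nagios.log
# (filter_handler_lines is a hypothetical helper name, not part of Nagios.)
filter_handler_lines() {
    # "EVENT HANDLER" matches the lines written when a handler is run
    # (assumes log_event_handlers=1); "wproc:" matches worker-process
    # messages such as a failed execvp().
    grep -E 'EVENT HANDLER|wproc:' "$1"
}

# To follow the live log with the same filter (assumes GNU grep for
# --line-buffered; the path is the one used in this thread):
# tail -f /usr/local/nagios/var/nagios.log | grep -E --line-buffered 'EVENT HANDLER|wproc:'
```

If the filter prints nothing after a forced check, the handler is probably not being triggered at all, which points at the service configuration rather than the script.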
You may also want to try running the command manually from the CLI:
Code: Select all
/usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh WARNING SOFT 1 "testoe2016-1.iss.inter-state.com"
Re: Event Handler problem
Posted: Thu Mar 28, 2019 4:19 pm
by gixxx11
The nagios log says:
Code: Select all
[1553807770] wproc: stderr line 01: execvp(/usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh, ...) failed. errno is 13: Permission denied
When I manually run the command I get:
Code: Select all
-bash: /usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh: Permission denied
So, shot in the dark, I'm guessing it's a permissions issue...
How do I fix it? Thank you!
Re: Event Handler problem
Posted: Thu Mar 28, 2019 4:27 pm
by scottwilkerson
Code: Select all
chmod +x /usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh
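For anyone who wants to check before changing anything, the following is a rough sketch that reports whether a script is executable and fixes it only if needed. The helper name `ensure_executable` is hypothetical.

```shell
#!/bin/sh
# Make a handler script executable if it is not already.
# Usage: ensure_executable /path/to/script.sh
# (ensure_executable is a hypothetical helper name.)
ensure_executable() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script" >&2
        return 1
    fi
    if [ -x "$script" ]; then
        # The current user can already execute the file.
        echo "already executable: $script"
    else
        # Same fix as the chmod above, applied only when required.
        chmod +x "$script" && echo "made executable: $script"
    fi
}

# Example with the path from this thread:
# ensure_executable /usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh
```

Note that `-x` tests the user running the check. Nagios runs the handler as its own user (often `nagios`, but that is an assumption about your install), so it is worth repeating the test as that user.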
One caveat when using the version that takes the hostname: the host names configured in Nagios MUST be the actual hostnames you want the script to connect to.
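If the configured host names are not resolvable, one possible workaround is to also pass the standard `$HOSTADDRESS$` macro as a fifth argument and prefer it over the name. This is only a sketch: the extra argument and the helper name `pick_target` are assumptions, not something already in the command definition above.

```shell
#!/bin/sh
# Choose the NRPE target: prefer the address, fall back to the host name.
# Assumed command line (the 5th argument is an addition for this sketch):
#   $USER1$/event_handler_DisableLoadBalancer.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ "$HOSTNAME$" "$HOSTADDRESS$"
# (pick_target is a hypothetical helper name.)
pick_target() {
    host_name="$1"
    host_address="$2"
    # ${var:-default} expands to the default when var is unset or empty.
    echo "${host_address:-$host_name}"
}

# Inside the handler script this would be used as:
# target=$(pick_target "$4" "$5")
```

With this in place the script keeps working with the four-argument command line, because an empty fifth argument falls back to the host name.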
Re: Event Handler problem
Posted: Thu Mar 28, 2019 4:34 pm
by gixxx11
Thanks for the heads-up on the hostnames.
After the permissions change I got this:
Code: Select all
/usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh: line 10: syntax error near unexpected token `)'
/usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh: line 10: ` CRITICAL)'
This is the current state of that code:
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer
# What state is the HTTP service in?
host=$4
case "$1" in
WARNING)
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H $host -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
exit 0
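A syntax error like this can be caught before Nagios ever runs the script, because the shell can parse a file without executing it. A minimal sketch follows; the function name `check_syntax` is made up for illustration.

```shell
#!/bin/sh
# Check a shell script for syntax errors without executing it.
# Usage: check_syntax /path/to/script.sh
# (check_syntax is a hypothetical helper name.)
check_syntax() {
    # -n makes the shell read and parse the commands but not run them.
    # Any parser message goes to stderr as usual.
    if sh -n "$1"; then
        echo "syntax OK"
    else
        echo "syntax error"
        return 1
    fi
}

# Example with the path from this thread:
# check_syntax /usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh
```

This only validates syntax for the shell that `sh` points to on the machine where it runs, so it is best run on the Nagios server itself.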
Re: Event Handler problem
Posted: Thu Mar 28, 2019 4:50 pm
by ssax
Try this:
Code: Select all
#!/bin/sh
#
# Event handler script for DisableLoadBalancer
# What state is the HTTP service in?
host="$4"
case "$1" in
WARNING)
;&
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H $host -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
Test:
Code: Select all
./scriptname.sh WARNING blah blah localhost
./scriptname.sh CRITICAL blah blah localhost
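The `;&` fall-through is a shell extension that some `/bin/sh` implementations may reject. A more portable way to express the same thing is to list both states in one pattern with `|`. The sketch below shows only the matching logic; the function name `classify_state` and the `act`/`skip` labels are invented for illustration.

```shell
#!/bin/sh
# Decide whether a service state should trigger the handler.
# Prints "act" for WARNING or CRITICAL and "skip" for anything else.
# (classify_state and its output labels are hypothetical.)
classify_state() {
    case "$1" in
        WARNING|CRITICAL)
            # One pattern covers both states, so no fall-through is needed.
            echo "act"
            ;;
        *)
            echo "skip"
            ;;
    esac
}

# In the real handler, the NRPE call from this thread would replace the echo:
# /usr/local/nagios/libexec/check_nrpe -H "$host" -p 5666 -c DisableLoadBalancer -a spooler
```

Quoting `"$host"` in the NRPE call also protects against an empty or unusual fourth argument.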
Re: Event Handler problem
Posted: Thu Mar 28, 2019 4:56 pm
by gixxx11
Perfection!
Thank you so very much. Everything is working like I want (for this one specific service at least). I tried critical and warning, both soft and hard and they all worked.
Thank you so very much!
Re: Event Handler problem
Posted: Thu Mar 28, 2019 5:01 pm
by ssax
Or you could do something like this:
Code: Select all
#!/bin/sh
#
# Event handler script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
# retried 3 times (in a "soft" state) or if the web service somehow
# manages to fall into a "hard" error state.
#
# $USER1$/event_handler_DisableLoadBalancer.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ "$HOSTNAME$"
# /usr/local/nagios/libexec/event_handler_DisableLoadBalancer.sh WARNING SOFT 1 "testoe2016-1.iss.inter-state.com"
#
STATE="$1"
STATETYPE="$2"
ATTEMPT="$3"
HOST="$4"
# What state is the HTTP service in?
case "$STATE" in
OK)
# The service just came back up, so don't do anything...
;;
UNKNOWN)
# We don't know what might be causing an unknown error, so don't do anything...
;;
WARNING)
# Warning, restart...
;&
CRITICAL)
# Aha! The HTTP service appears to have a problem - perhaps we should restart the server...
# Is this a "soft" or a "hard" state?
case "$STATETYPE" in
# We're in a "soft" state, meaning that Nagios is in the middle of retrying the
# check before it turns into a "hard" state and contacts get notified...
SOFT)
# What check attempt are we on? We don't want to restart the web server on the first
# check, because it may just be a fluke!
case "$ATTEMPT" in
# Wait until the check has been tried 3 times before restarting the web server.
# If the check fails on the 4th time (after we restart the web server), the state
# type will turn to "hard" and contacts will be notified of the problem.
# Hopefully this will restart the web server successfully, so the 4th check will
# result in a "soft" recovery. If that happens no one gets notified because we
# fixed the problem!
3)
echo -n "Restarting HTTP service (3rd soft critical state)..."
# Call the init script to restart the HTTPD server
/etc/rc.d/init.d/httpd restart
;;
esac
;;
# The HTTP service somehow managed to turn into a hard error without getting fixed.
# It should have been restarted by the code above, but for some reason it didn't.
# Let's give it one last try, shall we?
# Note: Contacts have already been notified of a problem with the service at this
# point (unless you disabled notifications for this service)
HARD)
echo -n "Restarting service..."
# Restart it through NRPE
/usr/local/nagios/libexec/check_nrpe -H $HOST -p 5666 -c DisableLoadBalancer -a spooler
;;
esac
;;
esac
exit 0
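To check the branching in a script like this without restarting anything, the decision can be separated from the action. The following is a rough sketch that mirrors the state, state type, and attempt logic above and only prints what would happen. The function name `handler_action` and the labels it prints are hypothetical.

```shell
#!/bin/sh
# Print the action the handler would take, without executing anything.
# Usage: handler_action STATE STATETYPE ATTEMPT
# (handler_action and the labels it prints are hypothetical.)
handler_action() {
    state="$1"
    state_type="$2"
    attempt="$3"
    case "$state" in
        WARNING|CRITICAL)
            case "$state_type" in
                SOFT)
                    # Only the third soft attempt triggers the local restart.
                    if [ "$attempt" = "3" ]; then
                        echo "restart-httpd"
                    else
                        echo "none"
                    fi
                    ;;
                HARD)
                    # A hard state triggers the NRPE command.
                    echo "run-nrpe"
                    ;;
                *)
                    echo "none"
                    ;;
            esac
            ;;
        *)
            # OK and UNKNOWN do nothing.
            echo "none"
            ;;
    esac
}

# Example: handler_action WARNING SOFT 1
```

Keeping the decision in a function like this makes it easy to try every combination of arguments from the command line before wiring the script into Nagios.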