Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
[1528896858] SERVICE ALERT: stag-hr-gal-gw-01;LService;CRITICAL;HARD;1;LService is stopped and can be resumed, trying to restart it
[1528896858] SERVICE EVENT HANDLER: stag-hr-gal-gw-01;LService;CRITICAL;HARD;1;resume-hr-LService
[1528896858] wproc: SERVICE EVENTHANDLER job 274 from worker Core Worker 727 is a non-check helper but exited with return code 2
[1528896858] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1528896858] wproc: stderr line 01: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 14: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: $=CRITICAL: not found
[1528896858] wproc: stderr line 02: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 15: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: $=HARD: not found
[1528896858] wproc: stderr line 03: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 16: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: $=1: not found
[1528896858] wproc: stderr line 04: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 17: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: $=10.101.22.76: not found
[1528896858] wproc: stderr line 05: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 18: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: $=$: not found
[1528896858] wproc: stderr line 06: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 53: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: Syntax error: end of file unexpected (expecting ";;")
It is as if the script can't get the values of the macros passed via event-handlers-commands.cfg.
#!/bin/sh
$SERVICESTATE$=$1
$SERVICESTATETYPE$=$2
$SERVICEATTEMPT$=$3
$HOSTADDRESS$=$4
$ISVALIDETIME$=$5
# What state is the service in?
case "$1" in
OK)
# The service just came back up, so don't do anything...
;;
WARNING)
# Warning don't do anything...
;;
CRITICAL)
# Is this a "soft" or a "hard" state?
case "$2" in
HARD)
case "$3" in
1)
if [ $5:hr-s3bt-nm$ = 1 ] ; then
# Trying to resume LService.
/usr/local/nagios/libexec/check_by_ssh -t 45 -H $4 -l nagios -C "sudo /usr/local/bin/start_ls.sh"
fi
;;
esac
;;
UNKNOWN)
# We don't know what might be causing an unknown error, so don't do anything...
;;
esac
exit 0
Could anyone please point me in the right direction?
Thank you
"It is impossible to work in information technology without also engaging in social engineering"
Jaron Lanier
#!/bin/sh
# $SERVICESTATE$=$1
# $SERVICESTATETYPE$=$2
# $SERVICEATTEMPT$=$3
# $HOSTADDRESS$=$4
# $ISVALIDETIME$=$5
# What state is the service in?
case "$1" in
OK)
# The service just came back up, so don't do anything...
;;
WARNING)
# Warning don't do anything...
;;
CRITICAL)
# Is this a "soft" or a "hard" state?
case "$2" in
HARD)
case "$3" in
1)
if [ $5:hr-s3bt-nm$ = 1 ] ; then
# Trying to resume LService.
/usr/local/nagios/libexec/check_by_ssh -t 45 -H $4 -l nagios -C "sudo /usr/local/bin/start_ls.sh"
fi
;;
esac
;;
UNKNOWN)
# We don't know what might be causing an unknown error, so don't do anything...
;;
esac
exit 0
[1528903309] SERVICE EVENT HANDLER: stag-hr-gal-gw-01;LService;CRITICAL;HARD;1;resume-hr-LService
[1528903309] wproc: SERVICE EVENTHANDLER job 50 from worker Core Worker 18693 is a non-check helper but exited with return code 2
[1528903309] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1528903309] wproc: stderr line 01: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: 53: /usr/local/nagios/libexec/eventhandlers/resume-hr-LService: Syntax error: end of file unexpected (expecting ";;")
You would need to change $5:hr-s3bt-nm$ in the if test to just $5; you can't combine them like this.
Either way, it seems the script needs some debugging. Try to make sure it runs correctly when you pass plain values where the macros would go
before having Nagios execute it.
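As a rough illustration of that kind of offline test, here is a minimal sketch that pulls the decision logic into a function and prints the action instead of calling check_by_ssh. The function name decide_action and the words restart/none are hypothetical and are not part of the original script; the argument order is assumed to match the one shown in the thread (state, state type, attempt, host address, time-period flag).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): prints the action
# the handler would take for a given set of arguments, without calling
# check_by_ssh, so the logic can be exercised by hand.
decide_action() {
    state=$1        # value Nagios would pass for $SERVICESTATE$
    state_type=$2   # value Nagios would pass for $SERVICESTATETYPE$
    attempt=$3      # value Nagios would pass for $SERVICEATTEMPT$
    in_period=$5    # 1 or 0, the time-period flag passed as fifth argument

    case "$state" in
        CRITICAL)
            case "$state_type" in
                HARD)
                    # Only act on the first hard attempt inside the time period.
                    if [ "$attempt" = 1 ] && [ "$in_period" = 1 ]; then
                        echo restart
                        return 0
                    fi
                    ;;
            esac
            ;;
    esac
    # Every other combination: do nothing.
    echo none
}
```

Calling decide_action CRITICAL HARD 1 10.101.22.76 1 prints restart, while the same call with SOFT or with a trailing 0 prints none. Separately, running sh -n against the handler file checks its syntax without executing it, which should surface a missing esac like the one reported in the log.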
#!/bin/sh
# $SERVICESTATE$=$1
# $SERVICESTATETYPE$=$2
# $SERVICEATTEMPT$=$3
# $HOSTADDRESS$=$4
# $ISVALIDTIME:hr-s3bt-nm$=$5
# What state is the service in?
case "$1" in
OK)
# The service just came back up, so don't do anything...
;;
WARNING)
# Warning don't do anything...
;;
CRITICAL)
# Is this a "soft" or a "hard" state?
case "$2" in
HARD)
case "$3" in
1)
if [ $5 = 1 ] ; then
# Trying to resume LService.
/usr/local/nagios/libexec/check_by_ssh -t 120 -H $4 -l nagios -C "sudo /usr/local/bin/start_LService.sh"
fi
;;
esac
;;
esac
;;
UNKNOWN)
# We don't know what might be causing an unknown error, so don't do anything...
;;
esac
exit 0
It seems much healthier and I don't have the previous errors in the log any more.
I am waiting to see if all works fine when the service goes down during the night.
Thank you very much for your help.
Please feel free to add anything that could be useful about using an event handler with a time period.
Best Regards
Warning: Service event handler command '/usr/local/nagios/libexec/eventhandlers/resume-hr-LService CRITICAL HARD 1 10.101.22.76 0' timed out after 0.00 seconds
Not sure what that means, but it did not work as expected.
Hi Scott,
Thank you...yes I changed the script to accept 0 since I wanted to test during working hours.
Anyway, it seems fine and it ran successfully, although I still need to figure out how to make it write to the log when it runs as an event handler. When I run it manually as you suggested, the log gets written, but not otherwise.
Thank you again for your time and help.
I think we can close this request.
Have a good day!
Regards
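On the open question of logging from the event handler: one possible cause, offered here as an assumption rather than a diagnosis, is that the handler runs as the nagios user with a different working directory and environment than an interactive shell, so a relative log path or a file that user cannot write to fails silently. A minimal sketch of a logging helper that uses an absolute path follows; the log location and the name log_msg are hypothetical.

```shell
#!/bin/sh
# Assumed log location (hypothetical): use an absolute path that the
# user running Nagios is allowed to write to.
LOG_FILE=${LOG_FILE:-/tmp/resume-hr-LService.log}

# Hypothetical helper: append one timestamped line to $LOG_FILE.
log_msg() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}
```

Calling log_msg "handler called with: $*" as the first statement of the handler records the arguments Nagios actually passed, which also helps confirm what value arrives in $5.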