Event Handlers strange behaviour
Posted: Mon May 13, 2013 8:07 am
by TSCAdmin
Hi,
We are using Nagios XI 2009R1.3 on CentOS release 5.4 (Final). On a number of hosts we want to e-mail the server owners when the /tmp partition goes over a threshold. To achieve this I am implementing event handlers. This is what my service definition looks like:
Code: Select all
define service {
host_name LinuxServer
service_description Disk Monitor /tmp
use xiwizard_linuxserver_disk_service
check_command check_snmp_storage_custom!community!20!50!/tmp
max_check_attempts 5
check_interval 10
retry_interval 1
event_handler event_handler_tmp_directory_listing
event_handler_enabled 1
flap_detection_enabled 0
notification_options w,u,r,c
contacts linux-admin
register 1
}
I have event_handler_tmp_directory_listing defined in commands.cfg. These are the first few lines of the event handler script:
Code: Select all
#!/bin/bash
# event handler script
# to get and e-mail the listing of /tmp directory
echo "$1 $2 $3 $4 $5" >> /usr/local/nagios/libexec/eventhandlers/inputs
For testing purposes I filled the /tmp directory on the monitored host (LinuxServer). The problem is that the event handler script is only called when the service returns to the OK state. Here are the contents of the inputs file:
Code: Select all
OK HARD 5 LinuxServer OK
OK SOFT 3 LinuxServer OK
OK SOFT 3 LinuxServer OK
OK HARD 5 LinuxServer OK
OK HARD 5 LinuxServer OK
I'm not sure why it is not being executed when the service goes into a WARNING or CRITICAL state. Is there something missing?
Thanks
Re: Event Handlers strange behaviour
Posted: Mon May 13, 2013 11:08 am
by abrist
Does your event handler script include logic for the PROBLEM-STATE?
See the bottom of the following document:
http://nagios.sourceforge.net/docs/3_0/ ... dlers.html
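For readers who have not seen that pattern, here is a minimal sketch of the kind of state logic being asked about. It assumes the script receives the service state, state type and attempt number as its first three arguments, in that order. The function name handle_event and the messages it prints are made up for illustration.

```shell
#!/bin/bash
# Sketch only: branch on the state and state type passed in by Nagios.
# Assumed argument order: $1 = service state, $2 = state type, $3 = attempt.
handle_event() {
    local state=${1:-} state_type=${2:-} attempt=${3:-}
    case "$state" in
        OK)
            # Service is healthy or has just recovered.
            echo "recovered: nothing to do" ;;
        WARNING|CRITICAL)
            case "$state_type" in
                # SOFT: Nagios is still retrying the check.
                SOFT) echo "soft $state (attempt $attempt): waiting" ;;
                # HARD: retries are exhausted, so this is where real work goes.
                HARD) echo "hard $state: act now" ;;
            esac ;;
        UNKNOWN)
            echo "unknown: nothing to do" ;;
    esac
}

handle_event "$@"
```

Called as `handle_event WARNING HARD 3`, the sketch takes the HARD branch. A real handler would replace the echo lines with the actual action, such as mailing a directory listing.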
Re: Event Handlers strange behaviour
Posted: Mon May 13, 2013 12:47 pm
by TSCAdmin
Hi,
Yes, it does include logic for the problem state. Here is how the command is defined:
Code: Select all
define command{
command_name event_handler_tmp_directory_listing
command_line /usr/local/nagios/libexec/eventhandlers/event_handler_tmp_directory_listing.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEOUTPUT$
}
For some reason, the event handler is called only when the service returns to the OK state; it does not execute when the service enters a WARNING or CRITICAL state. For troubleshooting purposes I added this line at the top of my script:
Code: Select all
#!/bin/bash
# event handler script
# to get and e-mail the listing of /tmp directory
echo "$1 $2 $3 $4 $5" >> /usr/local/nagios/libexec/eventhandlers/inputs
Now, if my understanding is correct, this line will execute every time the event handler is called. I was wondering whether event_handler also requires something like the "[w,u,c,r,f,s]" options in order to be executed?
In other words, what conditions need to be met for event handlers to execute? How can I confirm that the event handler was executed: logs or something?
Thanks
Re: Event Handlers strange behaviour
Posted: Mon May 13, 2013 4:55 pm
by abrist
Let's get some more information into the logs. In the file /usr/local/nagios/etc/nagios.cfg, change the log_event_handlers line so that event handler logging is enabled (log_event_handlers=1), then restart Nagios.
Tail the /usr/local/nagios/var/nagios.log file for any line pertaining to event handlers and then force something into a failed state:
Code: Select all
tail -f /usr/local/nagios/var/nagios.log
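To cut the log down to just the relevant entries, something like the following sketch may help. It assumes event handler entries carry the "SERVICE EVENT HANDLER:" or "GLOBAL SERVICE EVENT HANDLER:" labels (and their HOST equivalents), as in the log excerpts posted in this thread. The function name handler_lines is made up for illustration.

```shell
#!/bin/bash
# Sketch only: print the event handler entries from a Nagios log file.
handler_lines() {
    # Matches per-service, per-host and global handler entries.
    grep -E '(GLOBAL )?(SERVICE|HOST) EVENT HANDLER:' "$1"
}

# Log path as used in this thread; skipped quietly if the file is not readable.
log=/usr/local/nagios/var/nagios.log
if [ -r "$log" ]; then
    handler_lines "$log"
fi
```

Each matching line shows the host, service, state, state type, attempt number and the handler command that Nagios tried to run. It does not show whether that command succeeded.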
Re: Event Handlers strange behaviour
Posted: Tue May 14, 2013 6:06 am
by TSCAdmin
Hi,
I enabled log_event_handlers in nagios.cfg. Here are the complete details:
event handler definition:
Code: Select all
define command {
command_name xi_tmp_dir_event_handler
command_line /usr/local/nagios/libexec/eventhandlers/tmp_dir_event_handler.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ $SERVICEOUTPUT$
}
event handler script:
Code: Select all
#!/bin/bash
now=$(date +%s)
echo "[$now] $1 $2 $3 $4 $5" >> /usr/local/nagios/libexec/eventhandlers/new
service definition:
Code: Select all
define service {
host_name gb-doc-svb-0302
service_description Disk Monitor /home
use xiwizard_linuxserver_disk_service
check_command check_snmp_storage_custom!dhMonitor!40!70!/home
max_check_attempts 3
check_interval 10
retry_interval 1
check_period 24x7
event_handler xi_tmp_dir_event_handler
event_handler_enabled 1
flap_detection_enabled 0
notification_interval 60
notification_period 24x7
notification_options w,u,r,c
contacts ashishkumar
_xiwizard dh_linux_server
register 1
}
Here are the results:
Problem detected, WARNING - SOFT state 1
Code: Select all
[1368528937] SERVICE ALERT: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;1;WARNING : /home: 48%used(4806MB/9919MB) : > 40 %
[1368528937] GLOBAL SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;1;xi_service_event_handler
[1368528937] SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;1;xi_tmp_dir_event_handler
WARNING - SOFT state 2
Code: Select all
[1368528997] SERVICE ALERT: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;2;WARNING : /home: 48%used(4806MB/9919MB) : > 40 %
[1368528997] GLOBAL SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;2;xi_service_event_handler
[1368528997] SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;SOFT;2;xi_tmp_dir_event_handler
WARNING - HARD state
Code: Select all
[1368529057] SERVICE ALERT: gb-doc-svb-0302;Disk Monitor /home;WARNING;HARD;3;WARNING : /home: 48%used(4806MB/9919MB) : > 40 %
[1368529057] SERVICE NOTIFICATION: ashishkumar;gb-doc-svb-0302;Disk Monitor /home;WARNING;notify-service-by-email;WARNING : /home: 48%used(4806MB/9919MB) : 40 %
[1368529057] GLOBAL SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;HARD;3;xi_service_event_handler
[1368529057] SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;WARNING;HARD;3;xi_tmp_dir_event_handler
OK state
Code: Select all
[1368529099] SERVICE ALERT: gb-doc-svb-0302;Disk Monitor /home;OK;HARD;3;OK : : < 40 %
[1368529099] SERVICE NOTIFICATION: ashishkumar;gb-doc-svb-0302;Disk Monitor /home;OK;notify-service-by-email;OK : : 40 %
[1368529099] GLOBAL SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;OK;HARD;3;xi_service_event_handler
[1368529099] SERVICE EVENT HANDLER: gb-doc-svb-0302;Disk Monitor /home;OK;HARD;3;xi_tmp_dir_event_handler
event handler file contents
Code: Select all
$ cat /usr/local/nagios/libexec/eventhandlers/new
[1368529099] OK HARD 3 gb-doc-svb-0302 OK
It seems the event handler xi_tmp_dir_event_handler was called at every step, but it actually executed and wrote output only when the service returned to the OK state.
Please let me know if more information is required to investigate this further.
Thanks
Re: Event Handlers strange behaviour
Posted: Tue May 14, 2013 4:49 pm
by abrist
I am still digging on this one. What are you using the global event handler for? Could there be a conflict (writing to the same file, etc)?
Re: Event Handlers strange behaviour
Posted: Mon May 20, 2013 3:11 pm
by TSCAdmin
Hi,
I think I have cracked it! It was a mistake on my end; apologies for the trouble.
Everything was good and working except the CRITICAL/WARNING messages. The only catch was to quote the final argument, "$SERVICEOUTPUT$", that was being passed to the event handler script, and boooooooom!
Code: Select all
define command {
command_name xi_tmp_dir_event_handler
command_line /usr/local/nagios/libexec/eventhandlers/tmp_dir_event_handler.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ "$SERVICEOUTPUT$"
}
Thanks.
Re: Event Handlers strange behaviour
Posted: Mon May 20, 2013 3:19 pm
by slansing
Ahh! So it must have been lopping off that output when it was unconstrained by quotes. Interesting, thanks for the heads-up and the find!
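One plausible way to see why the quotes matter is sketched below, assuming the expanded command line is handed to a shell. It uses the output strings as they appear in the notification log lines above. The WARNING text contains parentheses, which an unquoted shell command line treats as syntax, while the OK text does not. The host name "host" and the helper run_handler are stand-ins for illustration, not part of Nagios.

```shell
#!/bin/bash
# Sketch only: what a shell does with the handler arguments, quoted vs. unquoted.
warn_out='WARNING : /home: 48%used(4806MB/9919MB) : 40 %'
ok_out='OK : : 40 %'

run_handler() {
    # $1 is the argument string exactly as it would sit on the command line.
    # The inner shell echoes its first five arguments, like the debug line
    # in the handler script from this thread.
    sh -c "set -- $1; echo \"\$1 \$2 \$3 \$4 \$5\"" 2>/dev/null
}

# OK output, unquoted: parses fine, but $5 is only the first word.
ok_unquoted=$(run_handler "OK HARD 3 host $ok_out")

# WARNING output, unquoted: the "(" is a shell syntax error, so nothing runs.
warn_unquoted=$(run_handler "WARNING HARD 3 host $warn_out") || warn_rc=$?

# WARNING output, quoted: the whole text arrives as a single fifth argument.
warn_quoted=$(run_handler "WARNING HARD 3 host \"$warn_out\"")

echo "OK, unquoted:      $ok_unquoted"
echo "WARNING, unquoted: ${warn_unquoted:-<nothing ran>} (exit ${warn_rc:-0})"
echo "WARNING, quoted:   $warn_quoted"
```

Under that assumption, the unquoted OK case matches the "OK HARD 3 ... OK" lines that were written to the file, and the unquoted WARNING case fails before the script ever starts. That would explain why the log showed the handler being called while the script wrote nothing.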