Suppress Notification of UP after UNREACHABLE (FIXED!)
Good day,
I'm trying to find out if anyone has come up with a way to suppress UP notification for a HOST after it has been in an UNREACHABLE state.
Now that I think about it, it would also be nice to get the DOWN notifications after UNREACHABLE, thus limiting the number of notifications after an outage/recovery scenario and possibly getting quicker problem notifications for children further down the tree.
R
Last edited by SRSchnell on Wed Sep 07, 2016 2:34 pm, edited 1 time in total.
Re: Suppress Notification of UP after UNREACHABLE
You would need to write a notification handler script and pass in the $LASTHOSTSTATE$ macro; then, if the last state was 'UNREACHABLE' and the current state $HOSTSTATE$ is 'UP', exit without sending anything.
That should get you going.
Code: Select all
define command {
command_name notify-host-by-email
command_line /usr/local/nagios/libexec/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ $CONTACTEMAIL$
}
Code: Select all
#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh
NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"
if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
exit 0
fi
/usr/bin/printf "%b" "***** Nagios Monitor XI Alert *****\n\nNotification Type: $NOTIFICATIONTYPE\nHost: $HOSTNAME\nState: $HOSTSTATE\nAddress: $HOSTADDRESS\nInfo: $HOSTOUTPUT\n\nDate/Time: $LONGDATETIME\n" | /bin/mail -s "** $NOTIFICATIONTYPE Host Alert: $HOSTNAME is $HOSTSTATE **" $CONTACTEMAIL
exit 0
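The suppression test in the handler above can be tried on its own before wiring it into Nagios. The following is a minimal sketch, not part of the original reply: `should_notify` and `check` are hypothetical helper names, and the state pairs are sample inputs standing in for the $LASTHOSTSTATE$ and $HOSTSTATE$ macros.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): same test the handler uses.
# $1 stands in for $LASTHOSTSTATE$, $2 for $HOSTSTATE$.
should_notify() {
    if [ "$1" = "UNREACHABLE" ] && [ "$2" = "UP" ]; then
        return 1    # suppress: recovery straight after UNREACHABLE
    fi
    return 0        # every other transition is sent as usual
}

# Hypothetical wrapper that prints the decision for one transition.
check() {
    if should_notify "$1" "$2"; then
        echo "$1 -> $2: notify"
    else
        echo "$1 -> $2: suppress"
    fi
}

check UNREACHABLE UP
check DOWN UP
check UP DOWN
check UNREACHABLE DOWN
```

Only the first transition is suppressed; the other three would still go out, including DOWN after UNREACHABLE.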
Re: Suppress Notification of UP after UNREACHABLE
Thanks ssax,
It mostly works. I added debug commands to find out what wasn't working. Is there any reason it won't seem to pass $CONTACTEMAIL$?
R
Re: Suppress Notification of UP after UNREACHABLE
Hmm, maybe it doesn't like the @ symbol, try this:
Let me know the results.
Code: Select all
define command {
command_name notify-host-by-email
command_line /usr/local/nagios/libexec/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$'
}
Code: Select all
#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh
NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"
if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
exit 0
fi
/usr/bin/printf "%b" "***** Nagios Monitor XI Alert *****\n\nNotification Type: $NOTIFICATIONTYPE\nHost: $HOSTNAME\nState: $HOSTSTATE\nAddress: $HOSTADDRESS\nInfo: $HOSTOUTPUT\n\nDate/Time: $LONGDATETIME\n" | /bin/mail -s "** $NOTIFICATIONTYPE Host Alert: $HOSTNAME is $HOSTSTATE **" "$CONTACTEMAIL"
exit 0
Re: Suppress Notification of UP after UNREACHABLE
I get the following error in nagios.log when things go down:
[1473169558] HOST NOTIFICATION: markw;test.switch;DOWN;notify-host-by-email-2;CRITICAL - Host Unreachable (192.168.0.10)
[1473169558] wproc: NOTIFY job 4 from worker Core Worker 21777 is a non-check helper but exited with return code 2
[1473169558] wproc: host=test.switch; service=(none); contact=markw
[1473169558] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1473169558] wproc: stderr line 01: /bin/sh: 1: Syntax error: "(" unexpected
and this upon recovery
8: NOTIFICATIONTYPE=RECOVERY
9: HOSTNAME=test.voip
10: HOSTSTATE=UP
11: LASTHOSTSTATE=UNREACHABLE
12: HOSTADDRESS=192.168.0.11
13: HOSTOUTPUT=PING
14: LONGDATETIME=OK
16: CONTACTEMAIL=-
These are the code snippets I'm using:
Code: Select all
# 'notify-host-by-email-2' command definition
define command {
command_name notify-host-by-email-2
command_line /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$' >>/home/rod/debug-nagios.txt
}
Code: Select all
#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh
exec 5> /home/rod/nagios-debug.txt
BASH_XTRACEFD="5"
PS4='$LINENO: '
set -x
NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"
echo $NOTIFICATIONTYPE
echo $HOSTNAME
echo $HOSTSTATE
echo $LASTHOSTSTATE
echo $HOSTADDRESS
echo $HOSTOUTPUT
echo $LONGDATETIME
echo $CONTACTEMAIL
if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
exit 0
fi
/usr/bin/printf "%b" "Subject: $NOTIFICATIONTYPE\n Service Alert: $HOSTALIAS/$SERVICEDESC is $SERVICESTATE\n\n***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE\n\nService: $SERVICEDESC\nHost: $HOSTALIAS\nAddress: $HOSTADDRESS\nState: $SERVICESTATE\n\nDate/Time: $LONGDATETIME\n\nAdditional Info: \n\n$SERVICEOUTPUT\n" | /usr/sbin/sendmail -vt '$CONTACTEMAIL'
exit 0
Re: Suppress Notification of UP after UNREACHABLE
Could you post the contact definition for 'markw' for us to look at?
Former Nagios Employee
Re: Suppress Notification of UP after UNREACHABLE
Code: Select all
# Mark W
define contact{
contact_name markw ; Short name of user
use generic-contact ; Inherit default values from generic-contact template
alias Mark W ; Full name of user
email [email protected] ; email address
host_notifications_enabled 1
# host_notification_options d,r
host_notification_commands notify-host-by-email-2
}
Re: Suppress Notification of UP after UNREACHABLE
Try changing the notify-host-by-email-2 command from
Code: Select all
# 'notify-host-by-email-2' command definition
define command {
command_name notify-host-by-email-2
command_line /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$' >>/home/rod/debug-nagios.txt
}
to
Code: Select all
# 'notify-host-by-email-2' command definition
define command {
command_name notify-host-by-email-2
command_line /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ "$HOSTNAME$" $HOSTSTATE$ $LASTHOSTSTATE$ "$HOSTADDRESS$" "$HOSTOUTPUT$" "$LONGDATETIME$" "$CONTACTEMAIL$" >>/home/rod/debug-nagios.txt
}
Restart Nagios and see if that fixes it.
Some macro values contain spaces and other characters that the shell treats specially, which could be causing the problem; the double quotes should fix that.
If not, can you post your generic-contact template?
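The effect of the missing quotes can be reproduced outside Nagios. This is a rough sketch under stated assumptions: `show_args` is a hypothetical stand-in for the handler, the sample values are invented to resemble typical macro expansions, and unquoted shell variables are used to imitate the word splitting that an unquoted, already-substituted macro goes through.

```shell
#!/bin/sh
# Hypothetical stand-in for the handler: report the argument count and
# the three positions the real script reads as $6, $7 and $8.
show_args() {
    echo "count=$# HOSTOUTPUT=$6 LONGDATETIME=$7 CONTACTEMAIL=$8"
}

# Assumed sample values (not taken from the thread).
output="PING OK - Packet loss = 0%"
datetime="Tue Sep 6 2016"
email="user@example.com"

# Unquoted: every space starts a new argument, so later positions shift.
show_args RECOVERY test.voip UP UNREACHABLE 192.168.0.11 $output $datetime $email

# Quoted: each value stays in one argument, so $8 is the address again.
show_args RECOVERY test.voip UP UNREACHABLE 192.168.0.11 "$output" "$datetime" "$email"
```

The unquoted call reports HOSTOUTPUT=PING, LONGDATETIME=OK and CONTACTEMAIL=-, the same pattern as the recovery trace posted earlier, which suggests the contact address was being pushed out of position rather than dropped.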
Re: Suppress Notification of UP after UNREACHABLE
That fixed the problem (other than formatting errors on my part)!
Can you expand upon the explanation of the difference between single quotes and double quotes?
Thanks for all the help!
Rod
Re: Suppress Notification of UP after UNREACHABLE (FIXED!)
It's likely that some escaping needed to be done at the CLI level that single quotes were not able to handle.
Former Nagios employee
https://www.mcapra.com/
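As a rough illustration of the quoting question, the sketch below passes a few command lines to `sh -c`, on the assumption that this approximates how the substituted command line reaches /bin/sh (the nagios.log error above came from /bin/sh). `run_line` is a hypothetical helper, and the sample text is the plugin output from the HOST NOTIFICATION log line. Judging by the two command definitions in the thread, the difference may have had less to do with the type of quote than with which macros were quoted at all.

```shell
#!/bin/sh
# Sample value: the plugin output shown in the nagios.log excerpt.
output='CRITICAL - Host Unreachable (192.168.0.10)'

# Hypothetical helper: hand one command-line string to "sh -c" and
# report whether the shell accepted it.
run_line() {
    if sh -c "$1" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
    fi
}

# 1. Unquoted value: the shell meets a bare "(" and rejects the line.
run_line "printf '%s\n' $output"

# 2. Double quotes: parentheses and spaces are protected.
run_line "printf '%s\n' \"$output\""

# 3. Single quotes protect them as well...
run_line "printf '%s\n' '$output'"

# 4. ...but an apostrophe inside the value ends the quoting early.
run_line "printf '%s\n' 'Host isn't reachable'"
```

Cases 2 and 3 are accepted; cases 1 and 4 fail. Double quotes have their own limits, since the shell still expands `$` and backticks inside them, so neither style is safe for arbitrary plugin output.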