Suppress Notification of UP after UNREACHABLE (FIXED!)

Posted: Wed Aug 31, 2016 8:42 am
by SRSchnell
Good day,

I'm trying to find out if anyone has come up with a way to suppress UP notification for a HOST after it has been in an UNREACHABLE state.

Now that I think about it, it would also be nice to get the DOWN notifications after UNREACHABLE, thus limiting the number of notifications after an outage/recovery scenario and possibly getting quicker problem notifications for children further down the tree.

R

Re: Suppress Notification of UP after UNREACHABLE

Posted: Wed Aug 31, 2016 2:56 pm
by ssax
You would need to write a notification handler script and pass in the $LASTHOSTSTATE$ macro; then, if the last state was 'UNREACHABLE' and the current state $HOSTSTATE$ is 'UP', exit without sending anything.

Code: Select all

define command {
        command_name    notify-host-by-email
        command_line    /usr/local/nagios/libexec/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ $CONTACTEMAIL$
}

Code: Select all

#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh

NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"

if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
    exit 0
fi

/usr/bin/printf "%b" "***** Nagios Monitor XI Alert *****\n\nNotification Type: $NOTIFICATIONTYPE\nHost: $HOSTNAME\nState: $HOSTSTATE\nAddress: $HOSTADDRESS\nInfo: $HOSTOUTPUT\n\nDate/Time: $LONGDATETIME\n" | /bin/mail -s "** $NOTIFICATIONTYPE Host Alert: $HOSTNAME is $HOSTSTATE **" $CONTACTEMAIL

exit 0
That should get you going.
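
The suppression rule can be checked on its own before wiring it into Nagios. This is only a minimal sketch: `should_notify` is a hypothetical helper name (it is not part of the script above), and the state pairs are sample inputs chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when a notification should be sent,
# 1 when it should be suppressed (UP directly after UNREACHABLE).
should_notify() {
    local last_state="$1" current_state="$2"
    if [ "$last_state" = "UNREACHABLE" ] && [ "$current_state" = "UP" ]; then
        return 1
    fi
    return 0
}

# Sample "last current" state pairs, for illustration only.
for pair in "UNREACHABLE UP" "DOWN UP" "UNREACHABLE DOWN" "UP DOWN"; do
    # Intentional word splitting: turn the pair into $1 and $2.
    set -- $pair
    if should_notify "$1" "$2"; then
        echo "$1 -> $2: notify"
    else
        echo "$1 -> $2: suppress"
    fi
done
```

Only the UNREACHABLE to UP transition is suppressed; every other pair still produces a notification, which matches the `if` test in the handler above.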

Re: Suppress Notification of UP after UNREACHABLE

Posted: Thu Sep 01, 2016 4:58 pm
by SRSchnell
Thanks ssax,

It mostly works. I added debug commands to find out what wasn't working. Is there any reason it doesn't seem to pass $CONTACTEMAIL$?

R

Re: Suppress Notification of UP after UNREACHABLE

Posted: Fri Sep 02, 2016 11:22 am
by ssax
Hmm, maybe it doesn't like the @ symbol, try this:

Code: Select all

define command {
        command_name    notify-host-by-email
        command_line    /usr/local/nagios/libexec/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$'
}

Code: Select all

#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh

NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"

if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
    exit 0
fi

/usr/bin/printf "%b" "***** Nagios Monitor XI Alert *****\n\nNotification Type: $NOTIFICATIONTYPE\nHost: $HOSTNAME\nState: $HOSTSTATE\nAddress: $HOSTADDRESS\nInfo: $HOSTOUTPUT\n\nDate/Time: $LONGDATETIME\n" | /bin/mail -s "** $NOTIFICATIONTYPE Host Alert: $HOSTNAME is $HOSTSTATE **" "$CONTACTEMAIL"

exit 0
Let me know the results.

Re: Suppress Notification of UP after UNREACHABLE

Posted: Tue Sep 06, 2016 8:56 am
by SRSchnell
I get the following error in nagios.log when things go down:

[1473169558] HOST NOTIFICATION: markw;test.switch;DOWN;notify-host-by-email-2;CRITICAL - Host Unreachable (192.168.0.10)
[1473169558] wproc: NOTIFY job 4 from worker Core Worker 21777 is a non-check helper but exited with return code 2
[1473169558] wproc: host=test.switch; service=(none); contact=markw
[1473169558] wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
[1473169558] wproc: stderr line 01: /bin/sh: 1: Syntax error: "(" unexpected

and this upon recovery:
8: NOTIFICATIONTYPE=RECOVERY
9: HOSTNAME=test.voip
10: HOSTSTATE=UP
11: LASTHOSTSTATE=UNREACHABLE
12: HOSTADDRESS=192.168.0.11
13: HOSTOUTPUT=PING
14: LONGDATETIME=OK
16: CONTACTEMAIL=-

These are the code snippets I'm using:

Code: Select all

# 'notify-host-by-email-2' command definition
define command {
        command_name    notify-host-by-email-2
        command_line    /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$' >>/home/rod/debug-nagios.txt
}

Code: Select all

#!/bin/bash
# /usr/local/nagios/libexec/host_notification_handler.sh
exec 5> /home/rod/nagios-debug.txt
BASH_XTRACEFD="5"
PS4='$LINENO: '
set -x

NOTIFICATIONTYPE="$1"
HOSTNAME="$2"
HOSTSTATE="$3"
LASTHOSTSTATE="$4"
HOSTADDRESS="$5"
HOSTOUTPUT="$6"
LONGDATETIME="$7"
CONTACTEMAIL="$8"

echo $NOTIFICATIONTYPE
echo $HOSTNAME
echo $HOSTSTATE
echo $LASTHOSTSTATE
echo $HOSTADDRESS
echo $HOSTOUTPUT
echo $LONGDATETIME
echo $CONTACTEMAIL

if [ "$LASTHOSTSTATE" = "UNREACHABLE" ] && [ "$HOSTSTATE" = "UP" ]; then
    exit 0
fi

/usr/bin/printf "%b" "Subject: $NOTIFICATIONTYPE\n Service Alert: $HOSTALIAS/$SERVICEDESC is $SERVICESTATE\n\n***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE\n\nService: $SERVICEDESC\nHost: $HOSTALIAS\nAddress: $HOSTADDRESS\nState: $SERVICESTATE\n\nDate/Time: $LONGDATETIME\n\nAdditional Info: \n\n$SERVICEOUTPUT\n" | /usr/sbin/sendmail -vt '$CONTACTEMAIL'


exit 0

Re: Suppress Notification of UP after UNREACHABLE

Posted: Tue Sep 06, 2016 2:35 pm
by rkennedy
Could you post the contact definition for 'markw' for us to look at?

Re: Suppress Notification of UP after UNREACHABLE

Posted: Tue Sep 06, 2016 3:28 pm
by SRSchnell

Code: Select all

# Mark W
define contact{
        contact_name                         markw                      ; Short name of user
        use                                  generic-contact            ; Inherit default values from generic-contact template
        alias                                Mark W                     ; Full name of user
        email                                [email protected]  ; email address
        host_notifications_enabled           1
#        host_notification_options            d,r
        host_notification_commands           notify-host-by-email-2
        }

Re: Suppress Notification of UP after UNREACHABLE

Posted: Wed Sep 07, 2016 1:27 pm
by tgriep
Try changing the notify-host-by-email-2 command from

Code: Select all

# 'notify-host-by-email-2' command definition
define command {
        command_name    notify-host-by-email-2
        command_line    /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ $HOSTNAME$ $HOSTSTATE$ $LASTHOSTSTATE$ $HOSTADDRESS$ $HOSTOUTPUT$ $LONGDATETIME$ '$CONTACTEMAIL$' >>/home/rod/debug-nagios.txt
}
to

Code: Select all

# 'notify-host-by-email-2' command definition
define command {
        command_name    notify-host-by-email-2
        command_line    /home/rod/host_notification_handler.sh $NOTIFICATIONTYPE$ "$HOSTNAME$" $HOSTSTATE$ $LASTHOSTSTATE$ "$HOSTADDRESS$" "$HOSTOUTPUT$" "$LONGDATETIME$" "$CONTACTEMAIL$" >>/home/rod/debug-nagios.txt
}
Restart Nagios and see if that fixes it.
Some of the macros expand to values that contain spaces and other special characters, which could be causing the problem. The double quotes keep each value together as a single argument.
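
The shifted values in the recovery debug output (HOSTOUTPUT=PING, LONGDATETIME=OK, CONTACTEMAIL=-) are consistent with this. A rough reproduction in a plain shell follows, using shell variables to stand in for Nagios macro substitution. `show_args` is a hypothetical stand-in for the handler, and the output text, date, and email address are assumed values, not taken from the thread.

```shell
#!/bin/bash
# Hypothetical stand-in for the handler: report what lands in $6, $7 and $8.
show_args() {
    echo "count=$# HOSTOUTPUT=$6 LONGDATETIME=$7 CONTACTEMAIL=$8"
}

# Assumed macro values, for illustration only.
HOSTOUTPUT='PING OK - Packet loss = 0%, RTA = 0.50 ms'
LONGDATETIME='Tue Sep 6 08:56:00 CDT 2016'
CONTACTEMAIL='markw@example.com'

# Unquoted: the shell splits each value on spaces, so later arguments shift.
unquoted=$(show_args RECOVERY test.voip UP UNREACHABLE 192.168.0.11 $HOSTOUTPUT $LONGDATETIME $CONTACTEMAIL)

# Quoted: each value stays in one positional argument.
quoted=$(show_args RECOVERY "test.voip" UP UNREACHABLE "192.168.0.11" "$HOSTOUTPUT" "$LONGDATETIME" "$CONTACTEMAIL")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

In the unquoted call the first three words of the host output occupy positions 6, 7 and 8, so the email address never reaches `$8`.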

If not, can you post your generic-contact template?

Re: Suppress Notification of UP after UNREACHABLE

Posted: Wed Sep 07, 2016 2:33 pm
by SRSchnell
That fixed the problem (other than formatting errors on my part ;) )!

Can you expand upon the explanation of the difference between single quotes and double quotes?

Thanks for all the help!

Rod

Re: Suppress Notification of UP after UNREACHABLE (FIXED!)

Posted: Wed Sep 07, 2016 3:20 pm
by mcapra
It's likely that some escaping needed to be done at the CLI level that the single quotes were not able to handle.
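
As a rough illustration of the quoting difference, assuming Nagios substitutes the macros first and then hands the resulting line to /bin/sh (which is what the "/bin/sh: 1: Syntax error" line in the log suggests): inside a shell script, single quotes stop variable expansion, while double quotes expand the variable and keep the result as one word. The email address below is an assumed value; the host output string is the one from the nagios.log excerpt earlier in the thread.

```shell
#!/bin/bash
# Assumed address, for illustration only.
CONTACTEMAIL='markw@example.com'

single='$CONTACTEMAIL'    # single quotes: no expansion, the text stays literal
double="$CONTACTEMAIL"    # double quotes: expanded, kept as a single word

echo "single: $single"
echo "double: $double"

# Host output as it appears in the nagios.log line from the thread.
output='CRITICAL - Host Unreachable (192.168.0.10)'

# Substituted without quotes, the parentheses are parsed as shell syntax.
if sh -c "echo $output" 2>/dev/null; then
    unquoted_status="ok"
else
    unquoted_status="syntax error"
fi

# Substituted inside double quotes, the same text is ordinary data.
quoted_result=$(sh -c "echo \"$output\"")

echo "unquoted: $unquoted_status"
echo "quoted:   $quoted_result"
```

Under that assumption, the quotes around $HOSTOUTPUT$ in the command definition are probably what removed the syntax error, and the single quotes around `$CONTACTEMAIL` inside a bash script would pass the literal text rather than the address.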