No notification on host down

Post by Tron911 »

Hello, I've got a problem with one of my Nagios Core installations.

I've checked everything (or almost everything, except the right thing, I assume :D ) but I can't find the cause.

The problem is that when one particular host goes down, Nagios does not notify me when it reaches the DOWN HARD state.

I'll attach some data...

This is what I get when running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg:

Code: Select all

Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 225 services.
        Checked 460 hosts.
        Checked 40 host groups.
        Checked 4 service groups.
        Checked 11 contacts.
        Checked 1 contact groups.
        Checked 44 commands.
        Checked 15 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 460 hosts
        Checked 0 service dependencies
        Checked 324 host dependencies
        Checked 15 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
This is the host template:

Code: Select all

define host{
	name                            ls-host
	hostgroups                      Tutti
	notifications_enabled           1
	event_handler_enabled           1
	flap_detection_enabled          0
	process_perf_data               1
	retain_status_information       1
	retain_nonstatus_information    1
	check_period                    24x7
	check_interval                  5
	retry_interval                  2
	max_check_attempts              3
	check_command                   check-ls-host-alive
	notification_period             24x7
	notification_interval           0
	notification_options            d,r
	contacts                        beppe,lscontact,reperibility-network-day,reperibility-system-day,customercontact
	register                        0
}
This is the host definition of the problematic device (name of host modified):

Code: Select all

define host{
	use ls-host
	host_name XX-YYY-STORAG01
	address 172.16.25.19
	hostgroups 25-XX-YYY,NAS
}
This is the check_command definition:

Code: Select all

define command{
	command_name    check-ls-host-alive
	command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 1500.0,80% -c 2000.0,100% -p 5 -t 3 -4
}
One of the contacts (address modified):

Code: Select all

define contact{
	contact_name                    beppe
	alias                           Beppe
	service_notification_period     24x7
	host_notification_period        24x7
	service_notification_options    w,u,c,r,f,s
	host_notification_options       d,u,r,f,s
	service_notification_commands   service-mail-noCC
	host_notification_commands      host-mail-noCC
	email                           beppe@nonexistentdomain.nix
}
And the host notification command used:

Code: Select all

define command{
	command_name	host-mail-noCC
	command_line	/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\n$HOSTALIAS$\n$HOSTNOTES$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n\nTime in state $HOSTSTATE$: $HOSTDURATION$\n" | /bin/mailx -s "** $HOSTNAME$ [$HOSTADDRESS$] $HOSTSTATE$ - Status: $NOTIFICATIONTYPE$ ** Time in state $HOSTSTATE$: $HOSTDURATION$ **" -r nagios@nonexistentdomain.nix $CONTACTEMAIL$
	}
This is the part of the nagios.log file where the host goes down (host names modified):
[Tue Feb 9 11:25:17 2016] HOST ALERT: PLUTO;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: beppe;PLUTO;UP;host-mail-noCC;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: lscontact;PLUTO;UP;autoticket-ls;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: lscontact;PLUTO;UP;host-mail-noCC;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: customercontact;PLUTO;UP;customailerhost;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:40:06 2016] HOST ALERT: XX-YYY-STORAG-01;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.25.19)
[Tue Feb 9 11:40:28 2016] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG-01;1455014427
[Tue Feb 9 11:40:31 2016] HOST ALERT: XX-YYY-STORAG-01;DOWN;SOFT;2;CRITICAL - Host Unreachable (172.16.25.19)
[Tue Feb 9 11:40:44 2016] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG-01;1455014443
[Tue Feb 9 11:40:47 2016] HOST ALERT: XX-YYY-STORAG-01;DOWN;HARD;3;CRITICAL - Host Unreachable (172.16.25.19)
[Tue Feb 9 12:53:54 2016] HOST ALERT: SATURN;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[Tue Feb 9 12:56:09 2016] HOST ALERT: SATURN;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[Tue Feb 9 12:58:24 2016] HOST ALERT: SATURN;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[Tue Feb 9 13:03:28 2016] HOST ALERT: SATURN;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 9.60 ms
Looking at that log, I've just discovered that this is not limited to a single host: SATURN also reached DOWN HARD without any notification!
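
A minimal Python sketch for checking how widespread this is. It assumes the `HOST ALERT` and `HOST NOTIFICATION` line layout shown in the log excerpt above; the function name `unnotified_hard_downs` is made up for illustration. It reports every DOWN;HARD alert that is not followed by a DOWN notification for the same host before that host's next alert:

```python
import re

# [Tue Feb 9 11:40:47 2016] HOST ALERT: host;DOWN;HARD;3;output
ALERT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] HOST ALERT: (?P<host>[^;]+);(?P<state>[^;]+);(?P<type>[^;]+);"
)
# [Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: contact;host;UP;command;output
NOTIFY_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] HOST NOTIFICATION: (?P<contact>[^;]+);(?P<host>[^;]+);(?P<state>[^;]+);"
)


def unnotified_hard_downs(lines):
    """Return (timestamp, host) for each DOWN;HARD alert that got no DOWN notification."""
    pending = {}  # host -> timestamp of a DOWN;HARD alert still waiting for a notification
    missing = []
    for line in lines:
        m = ALERT_RE.match(line)
        if m:
            host = m["host"]
            # A new alert for a host that is still pending means nothing was sent.
            if host in pending:
                missing.append((pending.pop(host), host))
            if m["state"] == "DOWN" and m["type"] == "HARD":
                pending[host] = m["ts"]
            continue
        m = NOTIFY_RE.match(line)
        if m and m["state"] == "DOWN":
            pending.pop(m["host"], None)
    # Hosts still pending at the end of the log were never notified either.
    missing.extend((ts, host) for host, ts in pending.items())
    return missing


# A few lines copied from the excerpt above.
SAMPLE = """\
[Tue Feb 9 11:25:17 2016] HOST ALERT: PLUTO;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:25:17 2016] HOST NOTIFICATION: beppe;PLUTO;UP;host-mail-noCC;PING OK - Packet loss = 0%, RTA = 13.01 ms
[Tue Feb 9 11:40:06 2016] HOST ALERT: XX-YYY-STORAG-01;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.25.19)
[Tue Feb 9 11:40:47 2016] HOST ALERT: XX-YYY-STORAG-01;DOWN;HARD;3;CRITICAL - Host Unreachable (172.16.25.19)
[Tue Feb 9 12:58:24 2016] HOST ALERT: SATURN;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[Tue Feb 9 13:03:28 2016] HOST ALERT: SATURN;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 9.60 ms
""".splitlines()

for ts, host in unnotified_hard_downs(SAMPLE):
    print(f"{host}: DOWN;HARD at {ts} with no DOWN notification")
```

On these sample lines it flags both SATURN and XX-YYY-STORAG-01. To check the real log, pass the lines of nagios.log instead of `SAMPLE`.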


Any ideas or suggestions?

In the meantime I'm rebooting...


Thank you in advance.

Re: No notification on host down

Post by rkennedy »

Can you post a tail of your maillog file? I wonder if it's getting filtered somewhere.

Code: Select all

tail -n200 /var/log/maillog

Re: No notification on host down

Post by Tron911 »

Yes sir, but there is nothing relevant there. I suppose the problem occurs before Postfix: it seems that Nagios does not send the notification at all once the host check command has failed.

Code: Select all

[root@zz-sed-monit01 ~]# tail -n200 /var/log/maillog
Feb  9 06:08:38 zz-sed-monit01 postfix/pickup[19408]: 005E84011142: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 06:08:38 zz-sed-monit01 postfix/cleanup[23480]: 005E84011142: message-id=<56b97455.Lw34oFKssnHg/asf%nagios@nonexistentdomain.nix>
Feb  9 06:08:38 zz-sed-monit01 postfix/qmgr[2572]: 005E84011142: from=<nagios@nonexistentdomain.nix>, size=633, nrcpt=1 (queue active)
Feb  9 06:08:39 zz-sed-monit01 postfix/smtp[23484]: 005E84011142: to=<alerts_servers@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.8, delays=0.02/0.03/0.01/1.7, dsn=2.0.0, status=sent (250 OK)
Feb  9 06:08:39 zz-sed-monit01 postfix/qmgr[2572]: 005E84011142: removed
Feb  9 06:08:40 zz-sed-monit01 postfix/smtp[23482]: EE2D240AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.1, delays=0.08/0.02/0.01/2, dsn=2.0.0, status=sent (250 OK)
Feb  9 06:08:40 zz-sed-monit01 postfix/qmgr[2572]: EE2D240AEAAF: removed
Feb  9 06:08:40 zz-sed-monit01 postfix/smtp[23483]: F2C7940AEAB4: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.4, delays=0.04/0.02/0.02/2.3, dsn=2.0.0, status=sent (250 OK)
Feb  9 06:08:40 zz-sed-monit01 postfix/qmgr[2572]: F2C7940AEAB4: removed
Feb  9 06:16:24 zz-sed-monit01 postfix/pickup[19408]: 86E284011142: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 06:16:24 zz-sed-monit01 postfix/cleanup[25481]: 86E284011142: message-id=<56b97628.5/zES9ubEb6RLJbP%nagios@nonexistentdomain.nix>
Feb  9 06:16:24 zz-sed-monit01 postfix/qmgr[2572]: 86E284011142: from=<nagios@nonexistentdomain.nix>, size=594, nrcpt=1 (queue active)
Feb  9 06:16:26 zz-sed-monit01 postfix/smtp[25483]: 86E284011142: to=<Ufficio.tecnico@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=2.4, delays=0.05/0.01/0.01/2.4, dsn=2.0.0, status=sent (250 OK)
Feb  9 06:16:26 zz-sed-monit01 postfix/qmgr[2572]: 86E284011142: removed
Feb  9 10:15:02 zz-sed-monit01 postfix/pickup[4506]: 695414011142: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 10:15:02 zz-sed-monit01 postfix/cleanup[21394]: 695414011142: message-id=<56b9ae16.dd841tO0+OYuYPpm%nagios@nonexistentdomain.nix>
Feb  9 10:15:02 zz-sed-monit01 postfix/pickup[4506]: 6CD3F401114E: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 10:15:02 zz-sed-monit01 postfix/qmgr[2572]: 695414011142: from=<nagios@nonexistentdomain.nix>, size=811, nrcpt=1 (queue active)
Feb  9 10:15:02 zz-sed-monit01 postfix/cleanup[21394]: 6CD3F401114E: message-id=<56b9ae16.c2OJgPLr353I34gf%nagios@nonexistentdomain.nix>
Feb  9 10:15:02 zz-sed-monit01 postfix/qmgr[2572]: 6CD3F401114E: from=<nagios@nonexistentdomain.nix>, size=806, nrcpt=1 (queue active)
Feb  9 10:15:03 zz-sed-monit01 postfix/smtp[21396]: 695414011142: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.3, delays=0.05/0.01/0.01/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 10:15:03 zz-sed-monit01 postfix/qmgr[2572]: 695414011142: removed
Feb  9 10:15:03 zz-sed-monit01 postfix/smtp[21397]: 6CD3F401114E: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.6, delays=0.05/0.02/0.01/1.5, dsn=2.0.0, status=sent (250 OK)
Feb  9 10:15:03 zz-sed-monit01 postfix/qmgr[2572]: 6CD3F401114E: removed
Feb  9 11:25:17 zz-sed-monit01 postfix/pickup[29723]: 32F4340AEAAF: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 11:25:17 zz-sed-monit01 postfix/cleanup[6946]: 32F4340AEAAF: message-id=<56b9be8d.bTtMkuvEYKOPSFdV%nagios@nonexistentdomain.nix>
Feb  9 11:25:17 zz-sed-monit01 postfix/qmgr[2572]: 32F4340AEAAF: from=<nagios@nonexistentdomain.nix>, size=813, nrcpt=1 (queue active)
Feb  9 11:25:17 zz-sed-monit01 postfix/pickup[29723]: 3770640AEAB4: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 11:25:17 zz-sed-monit01 postfix/cleanup[6946]: 3770640AEAB4: message-id=<56b9be8d.546bNj/Ue8s1Ihbr%nagios@nonexistentdomain.nix>
Feb  9 11:25:17 zz-sed-monit01 postfix/qmgr[2572]: 3770640AEAB4: from=<nagios@nonexistentdomain.nix>, size=808, nrcpt=1 (queue active)
Feb  9 11:25:19 zz-sed-monit01 postfix/smtp[6951]: 3770640AEAB4: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.1, delays=0.05/0.01/0.02/2, dsn=2.0.0, status=sent (250 OK)
Feb  9 11:25:19 zz-sed-monit01 postfix/qmgr[2572]: 3770640AEAB4: removed
Feb  9 11:25:19 zz-sed-monit01 postfix/smtp[6950]: 32F4340AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.4, delays=0.06/0.01/0.01/2.3, dsn=2.0.0, status=sent (250 OK)
Feb  9 11:25:19 zz-sed-monit01 postfix/qmgr[2572]: 32F4340AEAAF: removed
Feb  9 12:14:48 zz-sed-monit01 postfix/pickup[29723]: 2185A40AEAAF: uid=0 from=<root>
Feb  9 12:14:48 zz-sed-monit01 postfix/cleanup[19809]: 2185A40AEAAF: message-id=<20160209111448.2185A40AEAAF@zz-sed-monit01.it.customer.grp>
Feb  9 12:14:48 zz-sed-monit01 postfix/qmgr[2572]: 2185A40AEAAF: from=<root@zz-sed-monit01.it.customer.grp>, size=691, nrcpt=4 (queue active)
Feb  9 12:14:48 zz-sed-monit01 postfix/local[19812]: 2185A40AEAAF: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.07, delays=0.03/0.03/0/0.01, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:14:48 zz-sed-monit01 postfix/local[19814]: 2185A40AEAAF: to=<Raggiungile@zz-sed-monit01.it.customer.grp>, orig_to=<Raggiungile>, relay=local, delay=0.12, delays=0.03/0.04/0/0.05, dsn=5.1.1, status=bounced (unknown user: "raggiungile")
Feb  9 12:14:48 zz-sed-monit01 postfix/error[19811]: 2185A40AEAAF: to=<-r@zz-sed-monit01.it.customer.grp>, orig_to=<-r>, relay=none, delay=0.12, delays=0.03/0.05/0/0.04, dsn=5.1.3, status=bounced (bad address syntax)
Feb  9 12:14:51 zz-sed-monit01 postfix/smtp[19813]: 2185A40AEAAF: to=<nagios@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=3.2, delays=0.03/0.03/0.01/3.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 12:14:51 zz-sed-monit01 postfix/cleanup[19809]: 42D2340AEAB4: message-id=<20160209111451.42D2340AEAB4@zz-sed-monit01.it.customer.grp>
Feb  9 12:14:51 zz-sed-monit01 postfix/bounce[19816]: 2185A40AEAAF: sender non-delivery notification: 42D2340AEAB4
Feb  9 12:14:51 zz-sed-monit01 postfix/qmgr[2572]: 42D2340AEAB4: from=<>, size=3021, nrcpt=1 (queue active)
Feb  9 12:14:51 zz-sed-monit01 postfix/qmgr[2572]: 2185A40AEAAF: removed
Feb  9 12:14:51 zz-sed-monit01 postfix/local[19812]: 42D2340AEAB4: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.01, delays=0/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:14:51 zz-sed-monit01 postfix/qmgr[2572]: 42D2340AEAB4: removed
Feb  9 12:15:09 zz-sed-monit01 postfix/pickup[29723]: 651AA40AEAAF: uid=0 from=<root>
Feb  9 12:15:09 zz-sed-monit01 postfix/cleanup[19809]: 651AA40AEAAF: message-id=<20160209111509.651AA40AEAAF@zz-sed-monit01.it.customer.grp>
Feb  9 12:15:09 zz-sed-monit01 postfix/qmgr[2572]: 651AA40AEAAF: from=<root@zz-sed-monit01.it.customer.grp>, size=691, nrcpt=4 (queue active)
Feb  9 12:15:09 zz-sed-monit01 postfix/error[19811]: 651AA40AEAAF: to=<-r@zz-sed-monit01.it.customer.grp>, orig_to=<-r>, relay=none, delay=0.01, delays=0.01/0/0/0, dsn=5.1.3, status=bounced (bad address syntax)
Feb  9 12:15:09 zz-sed-monit01 postfix/local[19814]: 651AA40AEAAF: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.01, delays=0.01/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:15:10 zz-sed-monit01 postfix/local[19812]: 651AA40AEAAF: to=<Raggiungile@zz-sed-monit01.it.customer.grp>, orig_to=<Raggiungile>, relay=local, delay=1, delays=0.01/0/0/0.99, dsn=5.1.1, status=bounced (unknown user: "raggiungile")
Feb  9 12:15:11 zz-sed-monit01 postfix/smtp[19813]: 651AA40AEAAF: to=<nagios@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.7, delays=0.01/0/0.01/1.7, dsn=2.0.0, status=sent (250 OK)
Feb  9 12:15:11 zz-sed-monit01 postfix/cleanup[19809]: 1641F40AEAB4: message-id=<20160209111511.1641F40AEAB4@zz-sed-monit01.it.customer.grp>
Feb  9 12:15:11 zz-sed-monit01 postfix/bounce[19815]: 651AA40AEAAF: sender non-delivery notification: 1641F40AEAB4
Feb  9 12:15:11 zz-sed-monit01 postfix/qmgr[2572]: 1641F40AEAB4: from=<>, size=3021, nrcpt=1 (queue active)
Feb  9 12:15:11 zz-sed-monit01 postfix/qmgr[2572]: 651AA40AEAAF: removed
Feb  9 12:15:11 zz-sed-monit01 postfix/local[19814]: 1641F40AEAB4: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.01, delays=0/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:15:11 zz-sed-monit01 postfix/qmgr[2572]: 1641F40AEAB4: removed
Feb  9 12:15:19 zz-sed-monit01 postfix/pickup[29723]: A5E5E40AEAAF: uid=0 from=<root>
Feb  9 12:15:19 zz-sed-monit01 postfix/cleanup[19809]: A5E5E40AEAAF: message-id=<20160209111519.A5E5E40AEAAF@zz-sed-monit01.it.customer.grp>
Feb  9 12:15:19 zz-sed-monit01 postfix/qmgr[2572]: A5E5E40AEAAF: from=<root@zz-sed-monit01.it.customer.grp>, size=691, nrcpt=4 (queue active)
Feb  9 12:15:19 zz-sed-monit01 postfix/error[19811]: A5E5E40AEAAF: to=<-r@zz-sed-monit01.it.customer.grp>, orig_to=<-r>, relay=none, delay=0.01, delays=0.01/0/0/0, dsn=5.1.3, status=bounced (bad address syntax)
Feb  9 12:15:19 zz-sed-monit01 postfix/local[19812]: A5E5E40AEAAF: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.01, delays=0.01/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:15:20 zz-sed-monit01 postfix/smtp[19813]: A5E5E40AEAAF: to=<nagios@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.2, delays=0.01/0/0.01/1.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 12:15:20 zz-sed-monit01 postfix/local[19814]: A5E5E40AEAAF: to=<Raggiungile@zz-sed-monit01.it.customer.grp>, orig_to=<Raggiungile>, relay=local, delay=1.2, delays=0.01/0/0/1.2, dsn=5.1.1, status=bounced (unknown user: "raggiungile")
Feb  9 12:15:20 zz-sed-monit01 postfix/cleanup[19809]: E1FF040AEAB4: message-id=<20160209111520.E1FF040AEAB4@zz-sed-monit01.it.customer.grp>
Feb  9 12:15:20 zz-sed-monit01 postfix/qmgr[2572]: E1FF040AEAB4: from=<>, size=3021, nrcpt=1 (queue active)
Feb  9 12:15:20 zz-sed-monit01 postfix/bounce[19816]: A5E5E40AEAAF: sender non-delivery notification: E1FF040AEAB4
Feb  9 12:15:20 zz-sed-monit01 postfix/qmgr[2572]: A5E5E40AEAAF: removed
Feb  9 12:15:20 zz-sed-monit01 postfix/local[19812]: E1FF040AEAB4: to=<root@zz-sed-monit01.it.customer.grp>, relay=local, delay=0.01, delays=0/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 12:15:20 zz-sed-monit01 postfix/qmgr[2572]: E1FF040AEAB4: removed
Feb  9 12:26:27 zz-sed-monit01 postfix/pickup[21791]: 3EEC940AEAAF: uid=0 from=<nagios@nonexistentdomain.nix>
Feb  9 12:26:27 zz-sed-monit01 postfix/cleanup[22887]: 3EEC940AEAAF: message-id=<56b9cce3.eFE9W0m46uLFBo75%nagios@nonexistentdomain.nix>
Feb  9 12:26:27 zz-sed-monit01 postfix/qmgr[2572]: 3EEC940AEAAF: from=<nagios@nonexistentdomain.nix>, size=598, nrcpt=1 (queue active)
Feb  9 12:26:28 zz-sed-monit01 postfix/smtp[22889]: 3EEC940AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.6, delays=0.03/0.01/0.01/1.5, dsn=2.0.0, status=sent (250 OK)
Feb  9 12:26:28 zz-sed-monit01 postfix/qmgr[2572]: 3EEC940AEAAF: removed
Feb  9 12:26:57 zz-sed-monit01 postfix/pickup[21791]: 5167E40AEAAF: uid=0 from=<nagios@nonexistentdomain.nix>
Feb  9 12:26:57 zz-sed-monit01 postfix/cleanup[22887]: 5167E40AEAAF: message-id=<56b9cd01.UptSc1xL0l5M1PA3%nagios@nonexistentdomain.nix>
Feb  9 12:26:57 zz-sed-monit01 postfix/qmgr[2572]: 5167E40AEAAF: from=<nagios@nonexistentdomain.nix>, size=613, nrcpt=1 (queue active)
Feb  9 12:26:58 zz-sed-monit01 postfix/smtp[22889]: 5167E40AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.2, delays=0.01/0/0.01/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 12:26:58 zz-sed-monit01 postfix/qmgr[2572]: 5167E40AEAAF: removed
Feb  9 13:14:39 zz-sed-monit01 postfix/pickup[21791]: 257C140AEAAF: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 13:14:39 zz-sed-monit01 postfix/cleanup[2823]: 257C140AEAAF: message-id=<56b9d82f.bDdpra9q2mybhtS1%nagios@nonexistentdomain.nix>
Feb  9 13:14:39 zz-sed-monit01 postfix/qmgr[2572]: 257C140AEAAF: from=<nagios@nonexistentdomain.nix>, size=889, nrcpt=1 (queue active)
Feb  9 13:14:39 zz-sed-monit01 postfix/pickup[21791]: 29CE940AEAB4: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 13:14:39 zz-sed-monit01 postfix/cleanup[2823]: 29CE940AEAB4: message-id=<56b9d82f.6pRSJsg4oJ+wdfI6%nagios@nonexistentdomain.nix>
Feb  9 13:14:39 zz-sed-monit01 postfix/qmgr[2572]: 29CE940AEAB4: from=<nagios@nonexistentdomain.nix>, size=884, nrcpt=1 (queue active)
Feb  9 13:14:40 zz-sed-monit01 postfix/smtp[2827]: 29CE940AEAB4: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.2, delays=0.06/0.01/0.03/1.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 13:14:40 zz-sed-monit01 postfix/qmgr[2572]: 29CE940AEAB4: removed
Feb  9 13:14:40 zz-sed-monit01 postfix/smtp[2826]: 257C140AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.3, delays=0.07/0.02/0.02/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 13:14:40 zz-sed-monit01 postfix/qmgr[2572]: 257C140AEAAF: removed
Feb  9 13:24:38 zz-sed-monit01 postfix/pickup[21791]: 31FA540AEAAF: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 13:24:38 zz-sed-monit01 postfix/cleanup[5368]: 31FA540AEAAF: message-id=<56b9da86.djyl9Cw6yJp+Ue0M%nagios@nonexistentdomain.nix>
Feb  9 13:24:38 zz-sed-monit01 postfix/qmgr[2572]: 31FA540AEAAF: from=<nagios@nonexistentdomain.nix>, size=869, nrcpt=1 (queue active)
Feb  9 13:24:38 zz-sed-monit01 postfix/pickup[21791]: 3544040AEAB4: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 13:24:38 zz-sed-monit01 postfix/cleanup[5368]: 3544040AEAB4: message-id=<56b9da86.AoXuj7+Uj2b6sAdw%nagios@nonexistentdomain.nix>
Feb  9 13:24:38 zz-sed-monit01 postfix/qmgr[2572]: 3544040AEAB4: from=<nagios@nonexistentdomain.nix>, size=874, nrcpt=1 (queue active)
Feb  9 13:24:39 zz-sed-monit01 postfix/smtp[5371]: 3544040AEAB4: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.3, delays=0.05/0.02/0.04/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 13:24:39 zz-sed-monit01 postfix/qmgr[2572]: 3544040AEAB4: removed
Feb  9 13:24:40 zz-sed-monit01 postfix/smtp[5370]: 31FA540AEAAF: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.1, delays=0.05/0.01/0.01/2, dsn=2.0.0, status=sent (250 OK)
Feb  9 13:24:40 zz-sed-monit01 postfix/qmgr[2572]: 31FA540AEAAF: removed
Feb  9 14:31:23 zz-sed-monit01 postfix/pickup[13309]: 216F340AEAAF: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:31:23 zz-sed-monit01 postfix/cleanup[22222]: 216F340AEAAF: message-id=<56b9ea2b.EGixDRIYXBaoMV0V%nagios@nonexistentdomain.nix>
Feb  9 14:31:23 zz-sed-monit01 postfix/qmgr[2572]: 216F340AEAAF: from=<nagios@nonexistentdomain.nix>, size=594, nrcpt=1 (queue active)
Feb  9 14:31:24 zz-sed-monit01 postfix/smtp[22225]: 216F340AEAAF: to=<Ufficio.tecnico@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1, delays=0.03/0.01/0.01/0.98, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:31:24 zz-sed-monit01 postfix/qmgr[2572]: 216F340AEAAF: removed
Feb  9 14:42:44 zz-sed-monit01 postfix/postfix-script[2178]: starting the Postfix mail system
Feb  9 14:42:44 zz-sed-monit01 postfix/master[2201]: daemon started -- version 2.10.1, configuration /etc/postfix
Feb  9 14:43:16 zz-sed-monit01 postfix/pickup[2235]: 55A9C40AEAAF: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/cleanup[2904]: 55A9C40AEAAF: message-id=<56b9ecf4.P3FADXPYczO0gD6V%nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/qmgr[2236]: 55A9C40AEAAF: from=<nagios@nonexistentdomain.nix>, size=781, nrcpt=1 (queue active)
Feb  9 14:43:16 zz-sed-monit01 postfix/pickup[2235]: 601FA40AEAB4: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/cleanup[2904]: 601FA40AEAB4: message-id=<56b9ecf4.q8HnJqV9Ud46rJAe%nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/qmgr[2236]: 601FA40AEAB4: from=<nagios@nonexistentdomain.nix>, size=776, nrcpt=1 (queue active)
Feb  9 14:43:16 zz-sed-monit01 postfix/pickup[2235]: 61759401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/cleanup[2904]: 61759401114D: message-id=<56b9ecf4.vnHrM2Y3q721BYZO%nagios@nonexistentdomain.nix>
Feb  9 14:43:16 zz-sed-monit01 postfix/qmgr[2236]: 61759401114D: from=<nagios@nonexistentdomain.nix>, size=622, nrcpt=1 (queue active)
Feb  9 14:43:17 zz-sed-monit01 postfix/smtp[2908]: 61759401114D: to=<alerts_servers@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.3, delays=0.11/0.03/0.04/1.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:43:17 zz-sed-monit01 postfix/qmgr[2236]: 61759401114D: removed
Feb  9 14:43:17 zz-sed-monit01 postfix/smtp[2906]: 55A9C40AEAAF: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.4, delays=0.11/0.03/0.01/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:43:17 zz-sed-monit01 postfix/qmgr[2236]: 55A9C40AEAAF: removed
Feb  9 14:43:18 zz-sed-monit01 postfix/smtp[2907]: 601FA40AEAB4: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.2, delays=0.11/0.03/0.01/2, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:43:18 zz-sed-monit01 postfix/qmgr[2236]: 601FA40AEAB4: removed
Feb  9 14:46:40 zz-sed-monit01 postfix/pickup[2235]: 7C1BA401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:46:40 zz-sed-monit01 postfix/cleanup[3824]: 7C1BA401114D: message-id=<56b9edc0.jU/QA89vTlx5i5qG%nagios@nonexistentdomain.nix>
Feb  9 14:46:40 zz-sed-monit01 postfix/pickup[2235]: 7FA0D401114E: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 14:46:40 zz-sed-monit01 postfix/qmgr[2236]: 7C1BA401114D: from=<nagios@nonexistentdomain.nix>, size=795, nrcpt=1 (queue active)
Feb  9 14:46:40 zz-sed-monit01 postfix/cleanup[3824]: 7FA0D401114E: message-id=<56b9edc0.dxh9qKhgz9zcrMJt%nagios@nonexistentdomain.nix>
Feb  9 14:46:40 zz-sed-monit01 postfix/qmgr[2236]: 7FA0D401114E: from=<nagios@nonexistentdomain.nix>, size=790, nrcpt=1 (queue active)
Feb  9 14:46:42 zz-sed-monit01 postfix/smtp[3827]: 7FA0D401114E: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.2, delays=0.03/0.01/0.03/2.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:46:42 zz-sed-monit01 postfix/qmgr[2236]: 7FA0D401114E: removed
Feb  9 14:46:45 zz-sed-monit01 postfix/smtp[3826]: 7C1BA401114D: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=5, delays=0.05/0.01/0.01/5, dsn=2.0.0, status=sent (250 OK)
Feb  9 14:46:45 zz-sed-monit01 postfix/qmgr[2236]: 7C1BA401114D: removed
Feb  9 15:04:19 zz-sed-monit01 postfix/pickup[2235]: B768E401114D: uid=0 from=<root>
Feb  9 15:04:19 zz-sed-monit01 postfix/cleanup[8361]: B768E401114D: message-id=<20160209140419.B768E401114D@zz-sed-monit01.it.customer.grp>
Feb  9 15:04:19 zz-sed-monit01 postfix/qmgr[2236]: B768E401114D: from=<root@zz-sed-monit01.it.customer.grp>, size=1112, nrcpt=1 (queue active)
Feb  9 15:04:19 zz-sed-monit01 postfix/local[8364]: B768E401114D: to=<root@zz-sed-monit01.it.customer.grp>, orig_to=<root>, relay=local, delay=0.1, delays=0.03/0.05/0/0.02, dsn=2.0.0, status=sent (delivered to mailbox)
Feb  9 15:04:19 zz-sed-monit01 postfix/qmgr[2236]: B768E401114D: removed
Feb  9 15:24:40 zz-sed-monit01 postfix/pickup[2235]: 063B6401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/cleanup[13667]: 063B6401114D: message-id=<56b9f6a7.NPagiXFrrhGwdy89%nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/qmgr[2236]: 063B6401114D: from=<nagios@nonexistentdomain.nix>, size=784, nrcpt=1 (queue active)
Feb  9 15:24:40 zz-sed-monit01 postfix/pickup[2235]: 0A5EB401114E: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/cleanup[13667]: 0A5EB401114E: message-id=<56b9f6a7.en28ZE2ogpT93NHh%nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/qmgr[2236]: 0A5EB401114E: from=<nagios@nonexistentdomain.nix>, size=779, nrcpt=1 (queue active)
Feb  9 15:24:40 zz-sed-monit01 postfix/pickup[2235]: 0D4FB4011151: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/cleanup[13667]: 0D4FB4011151: message-id=<56b9f6a7.5plMiMcqfal8QivX%nagios@nonexistentdomain.nix>
Feb  9 15:24:40 zz-sed-monit01 postfix/qmgr[2236]: 0D4FB4011151: from=<nagios@nonexistentdomain.nix>, size=629, nrcpt=1 (queue active)
Feb  9 15:24:40 zz-sed-monit01 postfix/smtp[13672]: 0D4FB4011151: to=<alerts_servers@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=0.92, delays=0.03/0.01/0.04/0.84, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:24:40 zz-sed-monit01 postfix/qmgr[2236]: 0D4FB4011151: removed
Feb  9 15:24:41 zz-sed-monit01 postfix/smtp[13671]: 0A5EB401114E: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.2, delays=0.07/0.01/0.01/1.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:24:41 zz-sed-monit01 postfix/qmgr[2236]: 0A5EB401114E: removed
Feb  9 15:24:41 zz-sed-monit01 postfix/smtp[13670]: 063B6401114D: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.4, delays=0.06/0.02/0.01/1.4, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:24:41 zz-sed-monit01 postfix/qmgr[2236]: 063B6401114D: removed
Feb  9 15:28:24 zz-sed-monit01 postfix/pickup[2235]: 4E9B74035067: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/cleanup[14519]: 4E9B74035067: message-id=<56b9f788.dEIIfvG05npnHZ3L%nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/qmgr[2236]: 4E9B74035067: from=<nagios@nonexistentdomain.nix>, size=782, nrcpt=1 (queue active)
Feb  9 15:28:24 zz-sed-monit01 postfix/pickup[2235]: 530B0403506A: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/cleanup[14519]: 530B0403506A: message-id=<56b9f788.O1j3nGU/fq64B3bg%nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/qmgr[2236]: 530B0403506A: from=<nagios@nonexistentdomain.nix>, size=787, nrcpt=1 (queue active)
Feb  9 15:28:24 zz-sed-monit01 postfix/pickup[2235]: 55E51401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/cleanup[14519]: 55E51401114D: message-id=<56b9f788.rel3ooawkPZEQH4J%nagios@nonexistentdomain.nix>
Feb  9 15:28:24 zz-sed-monit01 postfix/qmgr[2236]: 55E51401114D: from=<nagios@nonexistentdomain.nix>, size=635, nrcpt=1 (queue active)
Feb  9 15:28:25 zz-sed-monit01 postfix/smtp[14527]: 55E51401114D: to=<alerts_servers@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.3, delays=0.02/0.01/0.01/1.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:28:25 zz-sed-monit01 postfix/qmgr[2236]: 55E51401114D: removed
Feb  9 15:28:26 zz-sed-monit01 postfix/smtp[14525]: 4E9B74035067: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.2, delays=0.04/0.01/0.01/2.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:28:26 zz-sed-monit01 postfix/qmgr[2236]: 4E9B74035067: removed
Feb  9 15:28:26 zz-sed-monit01 postfix/smtp[14526]: 530B0403506A: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=2.3, delays=0.05/0.01/0.02/2.2, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:28:26 zz-sed-monit01 postfix/qmgr[2236]: 530B0403506A: removed
Feb  9 15:31:28 zz-sed-monit01 postfix/pickup[2235]: 021B04022F13: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:31:28 zz-sed-monit01 postfix/cleanup[15494]: 021B04022F13: message-id=<56b9f83f.YcegDfwudffPAJfM%nagios@nonexistentdomain.nix>
Feb  9 15:31:28 zz-sed-monit01 postfix/qmgr[2236]: 021B04022F13: from=<nagios@nonexistentdomain.nix>, size=611, nrcpt=1 (queue active)
Feb  9 15:31:29 zz-sed-monit01 postfix/smtp[15496]: 021B04022F13: to=<Ufficio.tecnico@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.1, delays=0.02/0.01/0.01/1.1, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:31:29 zz-sed-monit01 postfix/qmgr[2236]: 021B04022F13: removed
Feb  9 15:38:46 zz-sed-monit01 postfix/pickup[2235]: 12BEF4022F13: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/cleanup[17226]: 12BEF4022F13: message-id=<56b9f9f5.gGRR0oCF9CMEiraD%nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/qmgr[2236]: 12BEF4022F13: from=<nagios@nonexistentdomain.nix>, size=789, nrcpt=1 (queue active)
Feb  9 15:38:46 zz-sed-monit01 postfix/pickup[2235]: 1911A401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/cleanup[17226]: 1911A401114D: message-id=<56b9f9f5.//k1xaRGt2xuXcZq%nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/qmgr[2236]: 1911A401114D: from=<nagios@nonexistentdomain.nix>, size=784, nrcpt=1 (queue active)
Feb  9 15:38:46 zz-sed-monit01 postfix/pickup[2235]: 1A58F401114E: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/cleanup[17226]: 1A58F401114E: message-id=<56b9f9f6.6AeY+l5IUOYKHHGy%nagios@nonexistentdomain.nix>
Feb  9 15:38:46 zz-sed-monit01 postfix/qmgr[2236]: 1A58F401114E: from=<nagios@nonexistentdomain.nix>, size=631, nrcpt=1 (queue active)
Feb  9 15:38:47 zz-sed-monit01 postfix/smtp[17238]: 1911A401114D: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.4, delays=0.07/0.03/0.01/1.3, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:38:47 zz-sed-monit01 postfix/qmgr[2236]: 1911A401114D: removed
Feb  9 15:38:47 zz-sed-monit01 postfix/smtp[17237]: 12BEF4022F13: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.6, delays=0.07/0.02/0.02/1.5, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:38:47 zz-sed-monit01 postfix/qmgr[2236]: 12BEF4022F13: removed
Feb  9 15:38:48 zz-sed-monit01 postfix/smtp[17240]: 1A58F401114E: to=<alerts_servers@nonexistentdomain.nix>, relay=172.16.2.12[172.16.2.12]:25, delay=1.9, delays=0.01/0.03/0.04/1.8, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:38:48 zz-sed-monit01 postfix/qmgr[2236]: 1A58F401114E: removed
Feb  9 15:45:00 zz-sed-monit01 postfix/pickup[2235]: 4F98E401114D: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:45:00 zz-sed-monit01 postfix/cleanup[18930]: 4F98E401114D: message-id=<56b9fb6c.tDDG6Mf8L4IplZTl%nagios@nonexistentdomain.nix>
Feb  9 15:45:00 zz-sed-monit01 postfix/qmgr[2236]: 4F98E401114D: from=<nagios@nonexistentdomain.nix>, size=784, nrcpt=1 (queue active)
Feb  9 15:45:00 zz-sed-monit01 postfix/pickup[2235]: 5294C4016EF3: uid=1000 from=<nagios@nonexistentdomain.nix>
Feb  9 15:45:00 zz-sed-monit01 postfix/cleanup[18930]: 5294C4016EF3: message-id=<56b9fb6c.sbXRCHMz+pABlxaJ%nagios@nonexistentdomain.nix>
Feb  9 15:45:00 zz-sed-monit01 postfix/qmgr[2236]: 5294C4016EF3: from=<nagios@nonexistentdomain.nix>, size=789, nrcpt=1 (queue active)
Feb  9 15:45:01 zz-sed-monit01 postfix/smtp[18933]: 5294C4016EF3: to=<beppe@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=1.4, delays=0.05/0.02/0.01/1.3, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:45:01 zz-sed-monit01 postfix/qmgr[2236]: 5294C4016EF3: removed
Feb  9 15:45:04 zz-sed-monit01 postfix/smtp[18932]: 4F98E401114D: to=<hd.customer@fallout.it>, relay=172.16.2.12[172.16.2.12]:25, delay=4.6, delays=0.07/0.01/3/1.5, dsn=2.0.0, status=sent (250 OK)
Feb  9 15:45:04 zz-sed-monit01 postfix/qmgr[2236]: 4F98E401114D: removed
I've enabled notification debugging (32, very detailed) and simulated a host down (by adding a fake route). Notification executed.

I've seen a lot of messages about notification suppression in the nagios.debug file, but they appear to be correct:

Code: Select all

[omitted output]
[Wed Feb 10 09:34:22 2016.682666] [032.0] [pid=18098] ** Service Notification Attempt ** Host: 'VENUS', Service: 'CHECK ESX DATASTORE', Type: NORMAL, Options: 0, Current State: 1, Last Notification: Thu Jan 28 16:55:16 2016
[Wed Feb 10 09:34:22 2016.682708] [032.1] [pid=18098] We shouldn't re-notify contacts about this service problem.
[Wed Feb 10 09:34:22 2016.682713] [032.0] [pid=18098] Notification viability test failed.  No notification will be sent out.
[omitted output]
Any ideas?

Thank you.
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: No notification on host down

Post by tgriep »

Is that host in a Host Dependencies group?
If so, can you post the settings for it?

Also, can you open this file, find the host in it and post that here?

Code: Select all

/usr/local/nagios/var/objects.cache
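For anyone who would rather not scroll through a large cache file by hand, here is a minimal Python sketch that pulls a single host definition out of the text of objects.cache. The function name `find_host` is made up for this example, and it assumes the usual layout of one directive per line with the closing brace on its own line, as in the blocks posted in this thread.

```python
def find_host(cache_text, host_name):
    """Return the directives of the `define host` block for host_name, or None."""
    block = None
    for raw in cache_text.splitlines():
        line = raw.strip()
        # Match "define host {" but not "define hostdependency {" and similar.
        if line.startswith("define host") and line[len("define host"):].strip() == "{":
            block = {}
        elif line == "}":
            if block is not None and block.get("host_name") == host_name:
                return block
            block = None
        elif block is not None and line:
            # Each directive is "name<whitespace>value".
            parts = line.split(None, 1)
            block[parts[0]] = parts[1] if len(parts) > 1 else ""
    return None
```

Reading the file with `open("/usr/local/nagios/var/objects.cache").read()` and passing the result in together with the host name should give a dict with entries such as `notification_options` and `notification_interval`.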
Be sure to check out our Knowledgebase for helpful articles and solutions!
Tron911
Posts: 7
Joined: Tue Feb 09, 2016 5:44 am

Re: No notification on host down

Post by Tron911 »

This is the dependency tree:

Code: Select all

define hostdependency {
	host_name	XX-YYY-FIREWA01
	dependent_host_name	XX-YYY-STORAG01
	inherits_parent	1
	notification_failure_options	d,u
	}

define hostdependency {
	host_name	XX-YYY-MCLINK
	dependent_host_name	XX-YYY-FIREWA01
	inherits_parent	1
	notification_failure_options	d,u
	}

define hostdependency {
	host_name	XX-YYY-TELECOM
	dependent_host_name	XX-YYY-FIREWA01
	inherits_parent	1
	notification_failure_options	d,u
	}
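Since every dependency above has inherits_parent set to 1, the dependent host also inherits the dependencies of its master. A rough Python sketch for listing every direct and inherited master of a host from definitions like these follows. The helper names `parse_dependencies` and `all_masters` are hypothetical, and the parser assumes one host per host_name / dependent_host_name directive, as in objects.cache.

```python
from collections import defaultdict

def parse_dependencies(text):
    """Map each dependent_host_name to the set of its direct master host_names."""
    masters = defaultdict(set)
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("define hostdependency"):
            block = {}
        elif line == "}":
            if block and "host_name" in block and "dependent_host_name" in block:
                masters[block["dependent_host_name"]].add(block["host_name"])
            block = None
        elif block is not None and line:
            parts = line.split(None, 1)
            block[parts[0]] = parts[1] if len(parts) > 1 else ""
    return masters

def all_masters(masters, host):
    """Walk the dependency graph upward and collect direct and inherited masters."""
    seen = set()
    stack = [host]
    while stack:
        for master in masters.get(stack.pop(), ()):
            if master not in seen:
                seen.add(master)
                stack.append(master)
    return seen
```

For the three definitions above, asking for the masters of XX-YYY-STORAG01 should return the firewall plus both uplinks, which are the hosts whose state is worth checking at the moment a notification goes missing.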

This is the host in the /usr/local/nagios/var/objects.cache file:

Code: Select all

define host {
	host_name	XX-YYY-STORAG01
	alias	XX-YYY-STORAG01
	address	172.16.25.19
	check_period	24x7
	check_command	check-ls-host-alive
	contacts	customercontact,reperibility-system-day,reperibility-network-day,lscontact,beppe
	notification_period	24x7
	initial_state	o
	importance	0
	check_interval	5.000000
	retry_interval	2.000000
	max_check_attempts	3
	active_checks_enabled	1
	passive_checks_enabled	1
	obsess	1
	event_handler_enabled	1
	low_flap_threshold	0.000000
	high_flap_threshold	0.000000
	flap_detection_enabled	0
	flap_detection_options	a
	freshness_threshold	0
	check_freshness	0
	notification_options	r,d
	notifications_enabled	1
	notification_interval	0.000000
	first_notification_delay	0.000000
	stalking_options	n
	process_perf_data	1
	retain_status_information	1
	retain_nonstatus_information	1
	}
Another notification lost... This is the output of a cat /nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e' | grep "Feb 11 06:":

Code: Select all

[Thu Feb 11 06:29:40 2016] HOST ALERT: QQ-RRR-MCLINK-SECONDARY;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[Thu Feb 11 06:31:55 2016] HOST ALERT: QQ-RRR-MCLINK-SECONDARY;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[Thu Feb 11 06:34:10 2016] HOST ALERT: QQ-RRR-MCLINK-SECONDARY;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[Thu Feb 11 06:44:29 2016] HOST ALERT: QQ-RRR-MCLINK-SECONDARY;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 9.38 ms
[Thu Feb 11 06:47:41 2016] Auto-save of retention data completed successfully.
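If perl is not at hand, the same timestamp conversion can be sketched in Python. This is only an illustration: the function name `humanize` is made up, it assumes each nagios.log line starts with an epoch value in square brackets, and the day of the month is zero-padded, so the output is not byte-for-byte identical to what perl's localtime prints.

```python
import re
from datetime import datetime

EPOCH_PREFIX = re.compile(r"^\[(\d+)\]")

def humanize(line, tz=None):
    """Replace a leading [epoch] stamp with a readable date (tz=None means local time)."""
    def repl(match):
        stamp = datetime.fromtimestamp(int(match.group(1)), tz)
        return "[" + stamp.strftime("%a %b %d %H:%M:%S %Y") + "]"
    return EPOCH_PREFIX.sub(repl, line)
```

Looping over the log file and printing `humanize(line)` only for lines that contain the host name narrows the output down much like the grep does.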
This is the host definition in objects.cache. It is not a dependent host, but it is the master host of another one:

Code: Select all

define host {
	host_name	QQ-RRR-MCLINK-SECONDARY
	alias	QQ-RRR-MCLINK-SECONDARY
	address	172.16.29.100
	check_period	24x7
	check_command	check-ls-host-alive
	contacts	customercontact,reperibility-system-day,reperibility-network-day,lscontact,beppe
	notification_period	24x7
	initial_state	o
	importance	0
	check_interval	5.000000
	retry_interval	2.000000
	max_check_attempts	3
	active_checks_enabled	1
	passive_checks_enabled	1
	obsess	1
	event_handler_enabled	1
	low_flap_threshold	0.000000
	high_flap_threshold	0.000000
	flap_detection_enabled	0
	flap_detection_options	a
	freshness_threshold	0
	check_freshness	0
	notification_options	r,d
	notifications_enabled	1
	notification_interval	0.000000
	first_notification_delay	0.000000
	stalking_options	n
	process_perf_data	1
	retain_status_information	1
	retain_nonstatus_information	1
	}

define hostdependency {
	host_name	QQ-RRR-MCLINK-SECONDARY
	dependent_host_name	QQ-RRR-FIREWALL
	inherits_parent	1
	notification_failure_options	d,u
	}
Thank you.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: No notification on host down

Post by ssax »

You have notification_interval set to 0 in the template. With this value set to 0, Nagios will not re-notify contacts about problems for this host; only one problem notification will be sent out. Are you sure no emails have been sent out? Can you try setting it to something else for testing, to see if any notifications get through?

Thank you
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: No notification on host down

Post by tgriep »

Can you remove that host from the Host Dependency and see if the notifications start to work?
Tron911
Posts: 7
Joined: Tue Feb 09, 2016 5:44 am

Re: No notification on host down

Post by Tron911 »

You found it (tgriep)!

@ ssax: the notification_interval was set to 0 intentionally. I'm also sure that no notifications had been sent before.

These are the logs involved in the test requested.

With dependencies:

Code: Select all

DOWN
Feb 16 08:49:09 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455608947
Feb 16 08:49:24 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
Feb 16 08:49:31 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455608970
Feb 16 08:49:47 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
Feb 16 08:49:59 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455608998
Feb 16 08:50:14 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%

UP
Feb 16 08:51:51 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455609109
Feb 16 08:51:55 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 20.17 ms
Without dependencies:

Code: Select all

DOWN
Feb 16 09:39:30 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455611969
Feb 16 09:39:45 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
Feb 16 09:39:57 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455611996
Feb 16 09:40:12 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
Feb 16 09:41:18 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455612077
Feb 16 09:41:34 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;DOWN;HARD;3;CRITICAL - Plugin timed out after 15 seconds
Feb 16 09:41:34 zz-sed-monit01 nagios: HOST NOTIFICATION: beppe;XX-YYY-STORAG01;DOWN;host-mail-noCC;CRITICAL - Plugin timed out after 15 seconds
Feb 16 09:41:34 zz-sed-monit01 nagios: HOST NOTIFICATION: lscontact;XX-YYY-STORAG01;DOWN;autoticket-ls;CRITICAL - Plugin timed out after 15 seconds
Feb 16 09:41:34 zz-sed-monit01 nagios: HOST NOTIFICATION: lscontact;XX-YYY-STORAG01;DOWN;host-mail-noCC;CRITICAL - Plugin timed out after 15 seconds
Feb 16 09:41:34 zz-sed-monit01 nagios: HOST NOTIFICATION: customercontact;XX-YYY-STORAG01;DOWN;customailerhost;CRITICAL - Plugin timed out after 15 seconds

UP
Feb 16 09:43:18 zz-sed-monit01 nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;XX-YYY-STORAG01;1455612197
Feb 16 09:43:22 zz-sed-monit01 nagios: HOST ALERT: XX-YYY-STORAG01;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 15.24 ms
Feb 16 09:43:22 zz-sed-monit01 nagios: HOST NOTIFICATION: beppe;XX-YYY-STORAG01;UP;host-mail-noCC;PING OK - Packet loss = 0%, RTA = 15.24 ms
Feb 16 09:43:22 zz-sed-monit01 nagios: HOST NOTIFICATION: lscontact;XX-YYY-STORAG01;UP;autoticket-ls;PING OK - Packet loss = 0%, RTA = 15.24 ms
Feb 16 09:43:22 zz-sed-monit01 nagios: HOST NOTIFICATION: lscontact;XX-YYY-STORAG01;UP;host-mail-noCC;PING OK - Packet loss = 0%, RTA = 15.24 ms
Feb 16 09:43:22 zz-sed-monit01 nagios: HOST NOTIFICATION: customercontact;XX-YYY-STORAG01;UP;customailerhost;PING OK - Packet loss = 0%, RTA = 15.24 ms
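The difference between the two runs can also be spotted automatically. The following Python sketch scans log lines in the format shown above and reports every HARD problem state (DOWN or UNREACHABLE) that is not followed by a HOST NOTIFICATION for the same host and state before that host's next alert. The function name `unnotified_hard_states` is made up, and the regular expressions only assume the field layout visible in these excerpts.

```python
import re

ALERT = re.compile(r"HOST ALERT: ([^;]+);(UP|DOWN|UNREACHABLE);(SOFT|HARD);")
NOTIFY = re.compile(r"HOST NOTIFICATION: [^;]+;([^;]+);(UP|DOWN|UNREACHABLE);")

def unnotified_hard_states(lines):
    """Return (host, state, line) for HARD problem alerts that got no notification."""
    pending = {}   # host -> latest HARD problem alert still waiting for a notification
    missing = []
    for line in lines:
        alert = ALERT.search(line)
        if alert:
            host, state, kind = alert.groups()
            # A new alert for the host closes the window for the previous one.
            if host in pending:
                missing.append(pending.pop(host))
            if kind == "HARD" and state != "UP":
                pending[host] = (host, state, line)
            continue
        note = NOTIFY.search(line)
        if note:
            host, state = note.groups()
            if host in pending and pending[host][1] == state:
                del pending[host]
    missing.extend(pending.values())
    return missing
```

Fed with the "With dependencies" excerpt it should report the 08:50:14 DOWN;HARD alert, and with the "Without dependencies" excerpt it should report nothing.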

So what is the problem? Host dependencies are almost mandatory in the customer's network...

Thank you,
Giuseppe
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: No notification on host down

Post by rkennedy »

This is strange. Can you please provide a couple of files for us to review? They are:

Code: Select all

/usr/local/nagios/var/status.dat
/usr/local/nagios/var/objects.cache
With those we can take a deeper look at what's going on.
Former Nagios Employee
Tron911
Posts: 7
Joined: Tue Feb 09, 2016 5:44 am

Re: No notification on host down

Post by Tron911 »

Hello, I've attached the requested files. I captured them twice, once with the host configured with the dependency and once without it.
Let me know if I can do something more.

Thank you,
Giuseppe
Attachments
Ticket.zip
File needed, with and without dependency.