using contacts causing double email alert

sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: using contacts causing double email alert

Post by sreinhardt »

hata_ph, could you post the logs that millisa requested? You guys seem to be on a pretty good path here!
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
hata_ph
Posts: 31
Joined: Wed Aug 20, 2014 9:41 pm

Re: using contacts causing double email alert

Post by hata_ph »

Sorry for the late reply. The log file is attached below.

It sends alert mail to both itdba and itnetworksupport with the config below:

Code: Select all

# services template for DBA team
define service{
	name				generic-service-dba
	use				generic-service-template
	contacts			itdba
	#contact_groups			db-admins
}
Attachments
nagios.log
(302.84 KiB) Downloaded 335 times
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: using contacts causing double email alert

Post by sreinhardt »

Unfortunately that log only shows 1 check for 6800SHDB - H Drive and a few checks for 6800SHDB-DR - H Drive, none of which caused an alert to be sent. Would you be willing to stop the snmp service on one of those two, let the alert come through, then send off the log again? Of course, once the alert comes through, make sure to start snmp back up again.
hata_ph
Posts: 31
Joined: Wed Aug 20, 2014 9:41 pm

Re: using contacts causing double email alert

Post by hata_ph »

Please check the end of the log:

Code: Select all

[1410754854] Caught SIGTERM, shutting down...
[1410754854] Successfully shutdown... (PID=7407)
[1410754854] Nagios 3.5.1 starting... (PID=20701)
[1410754854] Local time is Mon Sep 15 12:20:54 MYT 2014
[1410754854] LOG VERSION: 2.0
[1410754854] Finished daemonizing... (New PID=20702)
[1410754854] SERVICE FLAPPING ALERT: SIT-MBX02;Physical Memory;STARTED; Service appears to have started flapping (30.1% change >= 20.0% threshold)
[1410754854] SERVICE FLAPPING ALERT: SRMSHOST08A;Physical Memory;STARTED; Service appears to have started flapping (27.6% change >= 20.0% threshold)
[1410754924] SERVICE NOTIFICATION: itnetworksupport;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume  Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
[1410754924] SERVICE NOTIFICATION: itdba;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume  Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: using contacts causing double email alert

Post by slansing »

Would you be able to post your /var/log/maillog for the corresponding time period as well, around timestamp "1410754924", so we can actually see the duplicate mail being created and sent out?
hata_ph
Posts: 31
Joined: Wed Aug 20, 2014 9:41 pm

Re: using contacts causing double email alert

Post by hata_ph »

I have simulated the duplicate email alert today and have attached the log files. Please check the last few lines of the log for both nagios and postfix.
Attachments
nagios.log
(225.37 KiB) Downloaded 327 times
hata_ph
Posts: 31
Joined: Wed Aug 20, 2014 9:41 pm

Re: using contacts causing double email alert

Post by hata_ph »

For mail.log, I have changed my email domain name to xxx.com for security reasons.
Attachments
mail.log
(726.54 KiB) Downloaded 342 times
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: using contacts causing double email alert

Post by slansing »

Okay, so it definitely looks like something might be happening on the postfix end, since we can see them being duplicated there, or rather between nagios and the queue (at least to me, other eyes are welcome!).

What is the output of:

Code: Select all

postconf -n
Can we get a copy of your "notify-service-by-email" command definition as well? Do you have something like SpamAssassin, or anything else that handles security auditing or email interception, that differs from what ships with core or postfix on their own? I'll edit this and post more questions as I dig further, unless you reply first. Thanks hata_ph!
hata_ph
Posts: 31
Joined: Wed Aug 20, 2014 9:41 pm

Re: using contacts causing double email alert

Post by hata_ph »

slansing,

I am not sure it is caused by postfix. As I mentioned before, if I put contact_groups null together with contacts itdba, there are no duplicate alert emails:

Code: Select all

# services template for DBA team
define service{
   name            generic-service-dba
   use            generic-service-template
   contacts         itdba
   contact_groups null
   #contact_groups         db-admins
}

Here is my postconf -n output:

Code: Select all

administrator@SIT-NAGIOS:/etc/nagios3/conf.d$ postconf -n
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
config_directory = /etc/postfix
inet_interfaces = all
mailbox_size_limit = 0
mydestination = SIT-NAGIOS, localhost.localdomain, , localhost
myhostname = SIT-NAGIOS
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
readme_directory = no
recipient_delimiter = +
relayhost = sit-smtp01.xxx.com
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
smtpd_tls_cert_file = /etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file = /etc/ssl/private/ssl-cert-snakeoil.key
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtpd_use_tls = yes
Here are my notification definitions from commands.cfg:

Code: Select all

# 'notify-host-by-email' command definition
define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

# 'notify-service-by-email' command definition
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }
millisa
Posts: 69
Joined: Thu Jan 16, 2014 11:13 pm
Location: Austin, TX

Re: using contacts causing double email alert

Post by millisa »

When you set

Code: Select all

  contact_groups null
  contacts itdba
in the final service definition (NOT a template that is getting included with the 'use' line), what actual email addresses receive a mail?

I ask because of these entries from the postfix log (I have reordered them a bit to make it clear):

Code: Select all

Sep 17 08:42:09 SIT-NAGIOS postfix/pickup[19373]: 668066008D: uid=106 from=<nagios>
Sep 17 08:42:09 SIT-NAGIOS postfix/cleanup[4639]: 668066008D: message-id=<20140917004209.668066008D@SIT-NAGIOS>
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: from=<nagios@SIT-NAGIOS>, size=629, nrcpt=1 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4645]: 668066008D: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.47, delays=0.11/0.07/0.02/0.28, dsn=2.6.0, status=sent (250 2.6.0  <20140917004209.668066008D@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: removed

Code: Select all

Sep 17 08:42:09 SIT-NAGIOS postfix/pickup[19373]: 8E979600B3: uid=106 from=<nagios>
Sep 17 08:42:09 SIT-NAGIOS postfix/cleanup[4639]: 8E979600B3: message-id=<20140917004209.8E979600B3@SIT-NAGIOS>
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: from=<nagios@SIT-NAGIOS>, size=649, nrcpt=2 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itdba@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0  <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0  <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: removed
That is two separate notification emails being submitted by 'nagios', both at 8:42am localtime (which is GMT+8 based on the epochtime in the nagioslog). The first smtp submission with queue_id 668066008D only gets sent to itnetworksupport@. The second smtp submission with queue_id 8E979600B3 gets sent to two email addresses: itnetworksupport@ and itdba@.
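
For anyone who wants to repeat this comparison on their own mail.log, here is a rough Python sketch of the grouping I did by eye: collect postfix lines by queue_id, then report nrcpt and the recipients. The sample lines are trimmed copies of the excerpt above; the regex and the summarize helper are my own illustration, not part of postfix, and they assume the short hexadecimal queue_id format seen in this log.

```python
import re
from collections import defaultdict

# Trimmed copies of the mail.log lines quoted above.
LOG = """\
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: from=<nagios@SIT-NAGIOS>, size=629, nrcpt=1 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4645]: 668066008D: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, status=sent
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: from=<nagios@SIT-NAGIOS>, size=649, nrcpt=2 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itdba@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, status=sent
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, status=sent
"""

# Assumption: queue IDs are the short uppercase-hex form seen in this log.
QUEUE_LINE = re.compile(r"postfix/\w+\[\d+\]: ([0-9A-F]+): (.*)")


def summarize(log_text):
    """Return {queue_id: {"nrcpt": int or None, "to": [addresses]}}."""
    queues = defaultdict(lambda: {"nrcpt": None, "to": []})
    for line in log_text.splitlines():
        match = QUEUE_LINE.search(line)
        if not match:
            continue
        queue_id, rest = match.groups()
        nrcpt = re.search(r"nrcpt=(\d+)", rest)
        if nrcpt:
            queues[queue_id]["nrcpt"] = int(nrcpt.group(1))
        recipient = re.match(r"to=<([^>]+)>", rest)
        if recipient:
            queues[queue_id]["to"].append(recipient.group(1))
    return dict(queues)


summary = summarize(LOG)
for queue_id, info in summary.items():
    print(queue_id, "nrcpt =", info["nrcpt"], "recipients =", info["to"])
```

Any queue_id that reports nrcpt greater than 1 was handed to postfix with several recipients in a single submission, which is the case worth chasing here.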

Looking in the nagios log at 8:42am (00:42 GMT, or epoch 1410914529):

Code: Select all

[1410914529] SERVICE NOTIFICATION: itnetworksupport;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume  Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
[1410914529] SERVICE NOTIFICATION: itdba;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume  Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
There are only 2 service notifications, which matches the number of smtp submissions, but that doesn't account for one of those submissions going to multiple email addresses. The $CONTACTEMAIL$ macro should just contain the 'email' setting from that specific contact entry. One of those notifications from nagios was sent to 2 addresses in the very same submission (so effectively there were 3 mails). A contact group with 3 contacts as members would generate 3 separate and unique smtp submissions from nagios (each would have its own postfix queue_id). The notify-service-by-email command definition appears to use /usr/bin/mail and just has $CONTACTEMAIL$ at the end. Nothing in that would account for there being 2 email addresses in a single service notification submission.
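
To double-check how the epoch timestamp in the nagios log lines up with the 8:42am entries in the postfix log, a small Python sketch like this does the conversion. It assumes the server sits at a fixed UTC+8 offset (the nagios log header reports MYT); the describe helper is just an illustrative name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+8 offset, as suggested by the "MYT" local time in the nagios log.
MYT = timezone(timedelta(hours=8), "MYT")


def describe(epoch):
    """Return (utc_string, local_string) for a Unix epoch timestamp."""
    utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    local = utc.astimezone(MYT)
    fmt = "%Y-%m-%d %H:%M:%S"
    return utc.strftime(fmt), local.strftime(fmt)


utc_str, local_str = describe(1410914529)
print("UTC:  ", utc_str)    # 2014-09-17 00:42:09
print("local:", local_str)  # 2014-09-17 08:42:09
```

The local time lands on the same second as the postfix entries above, so the two SERVICE NOTIFICATION lines and the two queue_ids belong to the same event.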

Have you placed multiple email addresses in one of your contacts? You've been very diligent about removing the actual addresses before posting the configs, so this would be difficult for us to pick out.

This would be a different issue than 'why did 2 service notifications get generated' (which is still likely an inheritance issue; also your templates appear to not have register 0 in them again).

At first glance I thought you might have an aliases setup in postfix that was causing the one mail to get sent to two email addresses, however, the logs didn't show the 'orig_to' log line from postfix/local that usually accompanies that style of setup, and more importantly the log line has: nrcpt=2 . . . which means the mail was submitted with an original set of two recipients using a single /usr/bin/mail command.

Using the mail command like the following will produce a postfix log entry like the one seen for queue_id 8E979600B3:

Code: Select all

echo "Test body"|/usr/bin/mail -s "some subject" user1@example.com user2@example.com
I just tested the following on one of my nagios 3 systems and I suspect you've done something similar:

Code: Select all

define contact{
        contact_name                    nagiosadmin
        use                             generic-contact
        alias                           Nagios Admin
        email                           millisa@example.com aaron@example.com
        }
This passes a config file check. When I force a service that uses that nagiosadmin contact into a notification state using your notification command, I see exactly what you posted in the postfix log: an nrcpt=2 with just a single NOTIFICATION in the nagios.log, and both of those addresses getting a copy. You likely don't want to do this, even if it works; the email line as documented is for a single email address.
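
Since the addresses were scrubbed from the posted configs, a quick way to check this yourself is to scan the contact definitions for an email line that holds more than one address. The Python sketch below is only an illustration: the parsing is deliberately naive (it is not how nagios reads its config), the itdba@example.com address is made up, and contacts_with_multiple_emails is a hypothetical helper name.

```python
import re

# Sample definitions; the first address is a made-up placeholder,
# the second contact mirrors the test contact shown above.
CONFIG = """\
define contact{
        contact_name                    itdba
        email                           itdba@example.com
        }
define contact{
        contact_name                    nagiosadmin
        use                             generic-contact
        alias                           Nagios Admin
        email                           millisa@example.com aaron@example.com
        }
"""


def contacts_with_multiple_emails(text):
    """Return {contact_name: [addresses]} for contacts whose email holds several addresses."""
    flagged = {}
    for block in re.findall(r"define\s+contact\s*\{(.*?)\}", text, re.S):
        fields = {}
        for line in block.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                fields[parts[0]] = parts[1].strip()
        # Split on whitespace or commas to catch either separator style.
        addresses = [a for a in re.split(r"[\s,]+", fields.get("email", "")) if a]
        if len(addresses) > 1:
            flagged[fields.get("contact_name", "<unnamed>")] = addresses
    return flagged


flagged = contacts_with_multiple_emails(CONFIG)
print(flagged)
```

To run it against real files, read each contact config file into a string and pass it to the function; any contact it reports is a candidate for the nrcpt=2 submission.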