using contacts causing double email alert
-
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: using contacts causing double email alert
hata_ph, could you post the logs that millissa requested, you guys seem to be on a pretty good path here!
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: using contacts causing double email alert
Sorry for the late reply. The log file is attached below.
It sends alert mail to both itdba and itnetworksupport with the config below:
Code: Select all
# services template for DBA team
define service{
name generic-service-dba
use generic-service-template
contacts itdba
#contact_groups db-admins
}
Attachment: nagios.log (302.84 KiB)
Re: using contacts causing double email alert
Unfortunately that log only shows 1 check for 6800SHDB - H Drive and a few checks for 6800SHDB-DR - H Drive, none of which caused an alert to be sent. Would you be willing to stop the snmp service on one of those two, let the alert come through, then send the log again? Of course, once the alert comes through, make sure to start snmp back up again.
Re: using contacts causing double email alert
Please check the end of the log...
Code: Select all
[1410754854] Caught SIGTERM, shutting down...
[1410754854] Successfully shutdown... (PID=7407)
[1410754854] Nagios 3.5.1 starting... (PID=20701)
[1410754854] Local time is Mon Sep 15 12:20:54 MYT 2014
[1410754854] LOG VERSION: 2.0
[1410754854] Finished daemonizing... (New PID=20702)
[1410754854] SERVICE FLAPPING ALERT: SIT-MBX02;Physical Memory;STARTED; Service appears to have started flapping (30.1% change >= 20.0% threshold)
[1410754854] SERVICE FLAPPING ALERT: SRMSHOST08A;Physical Memory;STARTED; Service appears to have started flapping (27.6% change >= 20.0% threshold)
[1410754924] SERVICE NOTIFICATION: itnetworksupport;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
[1410754924] SERVICE NOTIFICATION: itdba;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
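For anyone following along, the two SERVICE NOTIFICATION lines above can be grouped per alert to see which contacts Nagios itself notified. This is only a rough sketch: it assumes the Nagios 3 log layout of contact;host;service;state;command;output, and the names `NOTIFY_RE` and `contacts_per_alert` are made up for illustration.

```python
import re
from collections import defaultdict

# The two notification lines copied from the log excerpt above.
NAGIOS_LOG = r"""[1410754924] SERVICE NOTIFICATION: itnetworksupport;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
[1410754924] SERVICE NOTIFICATION: itdba;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free"""

# Assumed field order: contact;host;service;state;command;output
NOTIFY_RE = re.compile(
    r"^\[(\d+)\] SERVICE NOTIFICATION: "
    r"([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);(.*)$"
)


def contacts_per_alert(log_text):
    """Map (timestamp, host, service, state) to the contacts that were notified."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = NOTIFY_RE.match(line)
        if not match:
            continue  # not a service notification line
        ts, contact, host, service, state, _command, _output = match.groups()
        grouped[(int(ts), host, service, state)].append(contact)
    return dict(grouped)


if __name__ == "__main__":
    for alert, contacts in contacts_per_alert(NAGIOS_LOG).items():
        print(alert, contacts)
```

For the excerpt above this gives a single alert with two contacts, itnetworksupport and itdba, so from the Nagios side it is one notification per contact.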
-
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: using contacts causing double email alert
Would you be able to post your /var/log/maillog for the corresponding time period as well? Around timestamp "1410754924" so we can actually see the duplicate mail being created and sent out?
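If it helps to find the matching maillog entries, the epoch timestamp can be turned into the "Mon DD HH:MM:SS" form that syslog writes. A minimal sketch, assuming the server's MYT zone is a fixed UTC+8 offset (which matches the "Local time is" line in the nagios.log excerpt) and an English/C locale for the month name; `epoch_to_syslog_stamp` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: MYT is a fixed UTC+8 offset with no daylight saving.
MYT = timezone(timedelta(hours=8), "MYT")


def epoch_to_syslog_stamp(epoch, tz=MYT):
    """Format an epoch time the way syslog prefixes its lines."""
    local = datetime.fromtimestamp(epoch, tz)
    # syslog pads single-digit days with a space, hence the 2d width
    return f"{local:%b} {local.day:2d} {local:%H:%M:%S}"


if __name__ == "__main__":
    print(epoch_to_syslog_stamp(1410754924))
```

Searching the maillog for the resulting prefix (minus the seconds, to allow for a little delay) should land on the submissions for that notification.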
Re: using contacts causing double email alert
I simulated the duplicate email alert today and have attached the log files. Please check the last few lines of the logs for both nagios and postfix.
Attachment: nagios.log (225.37 KiB)
Re: using contacts causing double email alert
For mail.log, I have changed my email domain name to xxx.com for security reasons.
Attachment: mail.log (726.54 KiB)
Re: using contacts causing double email alert
Okay, so it definitely looks like something might be happening on the postfix end, since we can see them being duplicated there, or rather between nagios and the queue (at least to me, other eyes are welcome!).
What is the output of:
Code: Select all
postconf -n
Can we get a copy of your "notify-service-by-email" command definition as well? Do you have something like SpamAssassin, or anything else that does security auditing or email interception beyond what ships with core or postfix on their own? I'll edit this and post more questions as I dig further, unless you reply first. Thanks hata_ph!
Re: using contacts causing double email alert
slansing,
I am not sure it is caused by postfix. As I mentioned before, if I set contact_groups null together with contacts itdba, there are no duplicate alert emails. The template, my postconf -n output, and my notification definitions from commands.cfg are below.
Code: Select all
# services template for DBA team
define service{
name generic-service-dba
use generic-service-template
contacts itdba
contact_groups null
#contact_groups db-admins
}
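The difference between the two versions of generic-service-dba can be modelled roughly as follows. This is a toy sketch of Nagios template inheritance, not the real resolver: it assumes that directives set locally override the parent, that everything else is inherited through `use`, and that the value `null` cancels an inherited string value. The parent's `contact_groups network-admins` entry is purely hypothetical, since generic-service-template is not shown in this thread.

```python
def resolve(name, objects):
    """Collect directives for `name`, parent first, local values winning."""
    obj = objects[name]
    resolved = {}
    parent = obj.get("use")
    if parent:
        resolved.update(resolve(parent, objects))
    for key, value in obj.items():
        if key != "use":
            resolved[key] = value
    return resolved


def effective(name, objects):
    """Drop directives whose value is 'null' (inheritance cancelled)."""
    return {k: v for k, v in resolve(name, objects).items() if v != "null"}


# Hypothetical parent: assumed to carry a contact group the child inherits.
PARENT = {"generic-service-template": {"contact_groups": "network-admins"}}

# First version posted: only 'contacts' is set in the child template.
FIRST = dict(PARENT)
FIRST["generic-service-dba"] = {
    "use": "generic-service-template",
    "contacts": "itdba",
}

# Second version: 'contact_groups null' cancels the inherited value.
SECOND = dict(PARENT)
SECOND["generic-service-dba"] = {
    "use": "generic-service-template",
    "contacts": "itdba",
    "contact_groups": "null",
}

if __name__ == "__main__":
    print(effective("generic-service-dba", FIRST))
    print(effective("generic-service-dba", SECOND))
```

Under those assumptions the first version ends up with both `contacts` and the inherited `contact_groups`, so both sets of people are notified, while the second keeps only `contacts itdba`.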
Here is my postconf -n output:
Code: Select all
administrator@SIT-NAGIOS:/etc/nagios3/conf.d$ postconf -n
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
config_directory = /etc/postfix
inet_interfaces = all
mailbox_size_limit = 0
mydestination = SIT-NAGIOS, localhost.localdomain, , localhost
myhostname = SIT-NAGIOS
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
readme_directory = no
recipient_delimiter = +
relayhost = sit-smtp01.xxx.com
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
smtpd_tls_cert_file = /etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file = /etc/ssl/private/ssl-cert-snakeoil.key
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtpd_use_tls = yes
Here are my notification definitions from commands.cfg:
Code: Select all
# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}
# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
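Both command lines above end with an unquoted $CONTACTEMAIL$, so whatever text that macro holds is split into words before /usr/bin/mail sees it. The sketch below approximates this under two assumptions: that Nagios substitutes macros as plain text, and that `shlex.split` is close enough to the shell's word splitting for this case. `COMMAND_TAIL` is a shortened stand-in for the real command_line, and `expand` and `recipients` are hypothetical helper names.

```python
import shlex

# Shortened stand-in for the tail of notify-service-by-email (not the full line).
COMMAND_TAIL = '/usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert **" $CONTACTEMAIL$'


def expand(command, macros):
    """Replace each $NAME$ macro with its value, as plain text."""
    for name, value in macros.items():
        command = command.replace(f"${name}$", value)
    return command


def recipients(command, macros):
    """Return the arguments that follow '-s <subject>' after word splitting."""
    argv = shlex.split(expand(command, macros))
    return argv[3:]  # argv[0]=mail, argv[1]=-s, argv[2]=subject


if __name__ == "__main__":
    print(recipients(COMMAND_TAIL, {
        "NOTIFICATIONTYPE": "PROBLEM",
        "CONTACTEMAIL": "user1@example.com",
    }))
    print(recipients(COMMAND_TAIL, {
        "NOTIFICATIONTYPE": "PROBLEM",
        "CONTACTEMAIL": "user1@example.com user2@example.com",
    }))
```

With one address in the macro there is one recipient argument; with two space-separated addresses there are two, in a single invocation of mail.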
Re: using contacts causing double email alert
When you set
Code: Select all
contact_groups null
contacts itdba
in the final service definition (NOT a template that is getting included with the 'use' line), what actual email addresses receive a mail?
I ask because of these entries from the postfix log (I have reordered them to make it clearer):
Code: Select all
Sep 17 08:42:09 SIT-NAGIOS postfix/pickup[19373]: 668066008D: uid=106 from=<nagios>
Sep 17 08:42:09 SIT-NAGIOS postfix/cleanup[4639]: 668066008D: message-id=<20140917004209.668066008D@SIT-NAGIOS>
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: from=<nagios@SIT-NAGIOS>, size=629, nrcpt=1 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4645]: 668066008D: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.47, delays=0.11/0.07/0.02/0.28, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.668066008D@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: removed
Code: Select all
Sep 17 08:42:09 SIT-NAGIOS postfix/pickup[19373]: 8E979600B3: uid=106 from=<nagios>
Sep 17 08:42:09 SIT-NAGIOS postfix/cleanup[4639]: 8E979600B3: message-id=<20140917004209.8E979600B3@SIT-NAGIOS>
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: from=<nagios@SIT-NAGIOS>, size=649, nrcpt=2 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itdba@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: removed
That is two separate notification emails being submitted by 'nagios', both at 8:42am local time (which is GMT+8, based on the epoch time in the nagios log). The first smtp submission, with queue_id 668066008D, only gets sent to itnetworksupport@. The second smtp submission, with queue_id 8E979600B3, gets sent to two email addresses: itnetworksupport@ and itdba@.
When looking in the nagios log at 8:42a (00:42 GMT, or epoch 1410914529):
Code: Select all
[1410914529] SERVICE NOTIFICATION: itnetworksupport;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
[1410914529] SERVICE NOTIFICATION: itdba;6800SRETDB;F Drive;CRITICAL;notify-service-by-email;SNMP CRITICAL - F:\ Label:New Volume Serial Number 46939174 at 96% with 10,190 of 286,081 MB free
There are only 2 service notifications, which matches the smtp submissions, but that doesn't account for one of those submissions going to multiple email addresses. The $CONTACTEMAIL$ macro should just contain the setting for 'email' from that specific contact entry. One of those notifications from nagios was sent to 2 addresses in the very same submission (so effectively there were 3 mails). A contact group with 3 contacts as members would generate 3 separate and unique smtp submissions (each would have its own postfix queue_id) from nagios. The notify-service-by-email command definition appears to use /usr/bin/mail and just has $CONTACTEMAIL$ at the end. Nothing in that would account for there being 2 email addresses in a single service notification submission.
Have you placed multiple email addresses in one of your contacts? You've been very diligent about removing the actual addresses before posting the configs, so this would be difficult for us to pick out.
This would be a different issue than 'why did 2 service notifications get generated' (which is still likely an inheritance issue; also your templates appear to not have register 0 in them again).
At first glance I thought you might have an aliases setup in postfix that was causing the one mail to get sent to two email addresses. However, the logs didn't show the 'orig_to' log line from postfix/local that usually accompanies that style of setup, and more importantly the log line has: nrcpt=2 . . . which means the mail was submitted with an original set of two recipients using a single /usr/bin/mail command.
Using the mail command like the following will cause a log entry in postfix like the one seen for queue_id 8E979600B3:
Code: Select all
echo "Test body"|/usr/bin/mail -s "some subject" user1@example.com user2@example.com
I just tested the following on one of my nagios 3 systems and I suspect you've done something similar:
Code: Select all
define contact{
contact_name nagiosadmin
use generic-contact
alias Nagios Admin
email millisa@example.com aaron@example.com
}
This passes a config file check. When I force a service that uses that nagiosadmin contact into a notification state using your notification command, I see exactly what you posted in the postfix log: an nrcpt=2 with just a single NOTIFICATION in the nagios.log, and both of those addresses getting a copy. You likely don't want to do this, even if it works; the email line as documented is for a single email address.
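To repeat this check on a larger mail.log, something like the following could group postfix lines by queue ID and compare nrcpt against the recipients actually delivered to. It is a rough sketch that assumes the log layout seen in the excerpts from this thread; the regexes and the `summarize` helper are illustrative names, not part of postfix.

```python
import re
from collections import defaultdict

# qmgr and smtp lines copied from the two submissions quoted in this thread.
MAILLOG = """\
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 668066008D: from=<nagios@SIT-NAGIOS>, size=629, nrcpt=1 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4645]: 668066008D: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.47, delays=0.11/0.07/0.02/0.28, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.668066008D@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/qmgr[976]: 8E979600B3: from=<nagios@SIT-NAGIOS>, size=649, nrcpt=2 (queue active)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itdba@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
Sep 17 08:42:09 SIT-NAGIOS postfix/smtp[4648]: 8E979600B3: to=<itnetworksupport@xxx.com>, relay=sit-smtp01.xxx.com[10.16.0.142]:25, delay=0.37, delays=0.01/0.04/0.09/0.22, dsn=2.6.0, status=sent (250 2.6.0 <20140917004209.8E979600B3@SIT-NAGIOS> Queued mail for delivery)
"""

POSTFIX_RE = re.compile(r"postfix/[\w-]+\[\d+\]: ([0-9A-F]+): (.*)$")
NRCPT_RE = re.compile(r"nrcpt=(\d+)")
TO_RE = re.compile(r"to=<([^>]*)>")  # used with match(), so orig_to= is ignored


def summarize(log_text):
    """Map each queue ID to its declared nrcpt and the recipients delivered to."""
    queues = defaultdict(lambda: {"nrcpt": None, "to": []})
    for line in log_text.splitlines():
        found = POSTFIX_RE.search(line)
        if not found:
            continue
        queue_id, rest = found.groups()
        nrcpt = NRCPT_RE.search(rest)
        if nrcpt:
            queues[queue_id]["nrcpt"] = int(nrcpt.group(1))
        recipient = TO_RE.match(rest)
        if recipient:
            queues[queue_id]["to"].append(recipient.group(1))
    return dict(queues)


if __name__ == "__main__":
    for queue_id, info in summarize(MAILLOG).items():
        print(queue_id, info["nrcpt"], info["to"])
```

Any queue ID that reports nrcpt greater than 1 for a Nagios notification is worth checking against the 'email' line of the contact that triggered it.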