Re: Multiple events created for the same alert.
Posted: Fri Jul 10, 2020 7:05 am
by typer100
Hi! They are duplicates only in content. The timestamps differ by a few seconds.
Another random notification I got this morning:
Timestamp 1st email:
OK: Used disk space was 47.70 % (Used: 1537.44 MB, Free: 1683.79 MB, Total: 3221.23 MB)
Date/Time: 2020-07-10 07:05:51
Timestamp last one:
OK: Used disk space was 47.70 % (Used: 1537.44 MB, Free: 1683.79 MB, Total: 3221.23 MB)
Date/Time: 2020-07-10 07:06:00
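For anyone comparing a larger batch of these, a minimal Python sketch of the same check is below: it groups notifications by message text and reports how many copies arrived and how far apart the first and last were. The function name `duplicate_spread` and the list-of-tuples input are made up for illustration; only the message text and the two timestamps come from the emails above.

```python
from collections import defaultdict
from datetime import datetime


def duplicate_spread(notifications):
    """Group (timestamp, message) pairs by message text.

    Returns {message: (copies, seconds between first and last copy)}
    for every message that appears more than once.
    """
    by_message = defaultdict(list)
    for stamp, message in notifications:
        by_message[message].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))

    result = {}
    for message, stamps in by_message.items():
        if len(stamps) > 1:
            spread = (max(stamps) - min(stamps)).total_seconds()
            result[message] = (len(stamps), spread)
    return result


MESSAGE = (
    "OK: Used disk space was 47.70 % "
    "(Used: 1537.44 MB, Free: 1683.79 MB, Total: 3221.23 MB)"
)

# First and last email from the batch quoted above.
sample = [
    ("2020-07-10 07:05:51", MESSAGE),
    ("2020-07-10 07:06:00", MESSAGE),
]

for text, (copies, seconds) in duplicate_spread(sample).items():
    print(f"{copies} copies within {seconds:.0f} s: {text}")
```

With the two timestamps above, the spread between the first and last copy is 9 seconds.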
More than one account is affected. I sent the email as you asked, and I got the 10 emails in both my personal inbox and our support group inbox (support.aix).
Both emails are in the same contact group. I use templates for the alerting.
This is the template I use for host management: SAQ_AIX_nonprod_hosts
*** NOTE: I will be on vacation for the next couple weeks. You won't receive any updates. ***
Re: Multiple events created for the same alert.
Posted: Fri Jul 10, 2020 2:40 pm
by benjaminsmith
Hi,
Looking over the information some more, I noticed that the Support AIX contact does not have a direct email address defined in its contact definition.
Code: Select all
define contact {
    contact_name                    Support AIX
    alias                           Support Aix
    service_notification_period     support.aix_notification_times
    host_notification_period        support.aix_notification_times
    service_notification_options    r,w,u,c,f,s
    host_notification_options       r,d,u,f,s
    service_notification_commands   xi_service_notification_handler
    host_notification_commands      xi_host_notification_handler
    minimum_importance              0
    host_notifications_enabled      1
    service_notifications_enabled   1
    can_submit_commands             1
    retain_status_information       1
    retain_nonstatus_information    1
}
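A rough way to run the same check across several contacts is sketched below in Python. It parses a single `define ... { }` block into a dictionary and reports whether the standard `email` directive is present. The helper name `parse_definition` is hypothetical, the sample is an abbreviated copy of the definition above, and the parser is deliberately simplistic (one block, no inline comments), so treat it as an illustration only.

```python
def parse_definition(text):
    """Parse one Nagios 'define <type> { ... }' block.

    Returns (object_type, directives), where directives maps each
    directive name to its value as a string.
    """
    object_type = None
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "}":
            continue
        if line.startswith("define"):
            # "define contact {" -> "contact"
            object_type = line[len("define"):].strip(" {")
            continue
        parts = line.split(None, 1)  # directive name, then the rest of the line
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return object_type, directives


# Abbreviated copy of the contact definition quoted above.
CONTACT = """
define contact {
contact_name Support AIX
alias Support Aix
service_notification_commands xi_service_notification_handler
host_notification_commands xi_host_notification_handler
}
"""

object_type, directives = parse_definition(CONTACT)
if "email" not in directives:
    print(f"{object_type} '{directives['contact_name']}' has no email directive")
```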
Looking at the host template, this host has one contact, opsgenie, and one contact group, xi_unix_contact_group. I noticed that opsgenie uses a custom notification handler, while the other contacts use the standard Nagios XI notification commands.
Code: Select all
define command {
command_name notify-service-by-opsgenie
command_line /usr/bin/nagios2opsgenie -entityType=service -t="$NOTIFICATIONTYPE$" -ldt="$LONGDATETIME$" -hn="$HOSTNAME$" -hdn="$HOSTDISPLAYNAME$" -hal="$HOSTALIAS$" -haddr="$HOSTADDRESS$" -hs="$HOSTSTATE$" -hsi="$HOSTSTATEID$" -lhs="$LASTHOSTSTATE$" -lhsi="$LASTHOSTSTATEID$" -hst="$HOSTSTATETYPE$" -ha="$HOSTATTEMPT$" -mha="$MAXHOSTATTEMPTS$" -hei="$HOSTEVENTID$" -lhei="$LASTHOSTEVENTID$" -hpi="$HOSTPROBLEMID$" -lhpi="$LASTHOSTPROBLEMID$" -hl="$HOSTLATENCY$" -het="$HOSTEXECUTIONTIME$" -hd="$HOSTDURATION$" -hds="$HOSTDURATIONSEC$" -hdt="$HOSTDOWNTIME$" -hpc="$HOSTPERCENTCHANGE$" -hgn="$HOSTGROUPNAME$" -hgns="$HOSTGROUPNAMES$" -lhc="$LASTHOSTCHECK$" -lhsc="$LASTHOSTSTATECHANGE$" -lhu="$LASTHOSTUP$" -lhd="$LASTHOSTDOWN$" -lhur="$LASTHOSTUNREACHABLE$" -ho="$HOSTOUTPUT$" -lho="$LONGHOSTOUTPUT$" -hpd="$HOSTPERFDATA$" -s="$SERVICEDESC$" -sdn="$SERVICEDISPLAYNAME$" -ss="$SERVICESTATE$" -ssi="$SERVICESTATEID$" -lss="$LASTSERVICESTATE$" -lssi="$LASTSERVICESTATEID$" -sst="$SERVICESTATETYPE$" -sa="$SERVICEATTEMPT$" -msa="$MAXSERVICEATTEMPTS$" -siv="$SERVICEISVOLATILE$" -sei="$SERVICEEVENTID$" -lsei="$LASTSERVICEEVENTID$" -spi="$SERVICEPROBLEMID$" -lspi="$LASTSERVICEPROBLEMID$" -sl="$SERVICELATENCY$" -set="$SERVICEEXECUTIONTIME$" -sd="$SERVICEDURATION$" -sds="$SERVICEDURATIONSEC$" -sdt="$SERVICEDOWNTIME$" -spc="$SERVICEPERCENTCHANGE$" -sgn="$SERVICEGROUPNAME$" -sgns="$SERVICEGROUPNAMES$" -lsch="$LASTSERVICECHECK$" -lssc="$LASTSERVICESTATECHANGE$" -lsok="$LASTSERVICEOK$" -lsw="$LASTSERVICEWARNING$" -lsu="$LASTSERVICEUNKNOWN$" -lsc="$LASTSERVICECRITICAL$" -so="$SERVICEOUTPUT$" -lso="$LONGSERVICEOUTPUT$" -spd="$SERVICEPERFDATA$" -teams=$_SERVICEOPSGENIETEAMS$
}
Can you temporarily remove this contact from the host and test a custom notification once more? I'm not sure how this handler is set up.
Re: Multiple events created for the same alert.
Posted: Mon Aug 03, 2020 10:28 am
by typer100
Hi! I'm back from vacation. I made both changes. Same issue: multiple emails. Sending a new profile.
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.
Re: Multiple events created for the same alert.
Posted: Tue Aug 04, 2020 12:05 pm
by tgriep
The configs from the profile look good but I have an idea.
You are running Mod Gearman, and I bet notifications are enabled in the Gearman module.conf file. If they are, they need to be disabled.
Edit this file
Set this option to no.
notifications=no
Save the file and restart Nagios so the change takes effect.
I suspect multiple workers are running the same notification command, which is why notifications should be disabled there.
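To double-check what the file ends up containing after the edit, a small Python sketch like the one below can read the effective value of an option from `key=value` text. The function name `gearman_option` and the sample file contents are made up for illustration, and the sketch assumes that `#` starts a comment line and that the last occurrence of an option wins; neither assumption is confirmed in this thread.

```python
def gearman_option(config_text, option):
    """Return the value of `option` in key=value config text, or None.

    Assumptions (not confirmed here): lines starting with '#' are
    comments, and if an option appears more than once the last
    occurrence is the one that counts.
    """
    value = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == option:
            value = val.strip()
    return value


# Hypothetical file contents, for illustration only.
SAMPLE_CONF = """
# notifications=yes
services=yes
notifications=no
"""

setting = gearman_option(SAMPLE_CONF, "notifications")
print("notifications =", setting)
if setting != "no":
    print("Notifications are not explicitly disabled in this file.")
```

To use it on the real file, read the file into a string with `open(path).read()` and pass that in place of `SAMPLE_CONF`.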
Re: Multiple events created for the same alert.
Posted: Tue Aug 04, 2020 12:25 pm
by typer100
Hi! I've modified the conf file and restarted Nagios. Same issue.
Re: Multiple events created for the same alert.
Posted: Wed Aug 05, 2020 1:37 pm
by benjaminsmith
Hi @typer100,
We've ruled out the Nagios configuration, the Nagios process, the notification handler, and the Mod Gearman setup, so it's likely that the mail server is sending multiple notifications.
To confirm, the next time you receive multiple messages, can you copy and paste each full message, including the email headers, into a txt file and attach it to the thread? That will let us figure out which server is sending the multiple notifications.
Thanks for your patience on this,
Benjamin
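As a rough illustration of what the headers can show, the Python sketch below uses the standard `email` module to pull the `Message-ID` and the `Received` hops out of a saved message. The host names and header values in the sample are invented (example.com), and `delivery_path` is a hypothetical helper name. The general idea: if the duplicate copies share one `Message-ID`, the message was most likely duplicated somewhere along the relay path; if each copy has its own, the sending side probably generated it more than once.

```python
from email import message_from_string


def delivery_path(raw_message):
    """Return the Message-ID and the Received hops, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Each server prepends its own Received header, so the list is
    # newest-first in the file; reverse it for chronological order.
    ordered = [" ".join(hop.split()) for hop in reversed(hops)]
    return {"message_id": msg.get("Message-ID"), "hops": ordered}


# Invented headers for illustration; a real message will differ.
RAW = (
    "Received: from relay.example.com by mx.example.com;"
    " Fri, 10 Jul 2020 07:05:53 -0400\n"
    "Received: from nagios.example.com by relay.example.com;"
    " Fri, 10 Jul 2020 07:05:51 -0400\n"
    "Message-ID: <abc123@nagios.example.com>\n"
    "Subject: test notification\n"
    "\n"
    "body\n"
)

info = delivery_path(RAW)
print("Message-ID:", info["message_id"])
for number, hop in enumerate(info["hops"], start=1):
    print(f"hop {number}: {hop}")
```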
Re: Multiple events created for the same alert.
Posted: Wed Aug 05, 2020 2:15 pm
by typer100
I've copied the full email for the first 3 I got. If you want all ten, let me know.
Re: Multiple events created for the same alert.
Posted: Thu Aug 06, 2020 2:29 pm
by tgriep
Try this: go to the Admin > Mail Settings menu and, in the SMTP Host field, enter the IP address of the SMTP server instead of the hostname.
Use this address: 198.140.134.10
Save the change and see if the multiple emails stop.
Re: Multiple events created for the same alert.
Posted: Fri Aug 07, 2020 8:09 am
by typer100
Hi! Emails are not going out. This is what I got from the /usr/local/nagiosxi/tmp/phpmailer.log.
[08-07-2020 08:47:02] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=198.140.134.10;port=25;security=tls), Referer: includes/components/xicore/xicore.inc.php > Event Handler Notification Email
Re: Multiple events created for the same alert.
Posted: Fri Aug 07, 2020 8:35 am
by scottwilkerson
typer100 wrote:Hi! Emails are not going out. This is what I got from the /usr/local/nagiosxi/tmp/phpmailer.log.
[08-07-2020 08:47:02] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=198.140.134.10;port=25;security=tls), Referer: includes/components/xicore/xicore.inc.php > Event Handler Notification Email
Can you confirm that the following settings for the server 198.140.134.10 are correct in Admin -> Email Settings?
Port
Security type
Username/password
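The port and security type currently in use can be read back from the phpmailer.log line quoted above, and the connection can then be tried outside of Nagios XI. The Python sketch below does both with the standard library. The helper names `settings_from_log` and `probe_smtp` are hypothetical, the log line is shortened to the part that matters, and the sketch assumes that `security=tls` means STARTTLS on a plain connection, which may not match how the server is actually configured.

```python
import re
import smtplib


def settings_from_log(line):
    """Extract the '(method=...;host=...;...)' part of a log line as a dict."""
    match = re.search(r"\(method=[^)]*\)", line)
    if not match:
        return {}
    pairs = match.group(0).strip("()").split(";")
    return dict(pair.split("=", 1) for pair in pairs)


def probe_smtp(host, port, security, timeout=10):
    """Try to connect; return None on success or the error text on failure.

    Assumption: security == "tls" means STARTTLS after a plain connect.
    """
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            server.ehlo()
            if security == "tls":
                server.starttls()
                server.ehlo()
        return None
    except (OSError, smtplib.SMTPException) as exc:
        return str(exc)


# Shortened copy of the log line quoted above.
LOG_LINE = (
    "SMTP connect() failed. "
    "(method=smtp;host=198.140.134.10;port=25;security=tls)"
)

settings = settings_from_log(LOG_LINE)
print(settings)

# Run this from the Nagios XI server itself; it needs network access:
# error = probe_smtp(settings["host"], int(settings["port"]), settings["security"])
# print("connected" if error is None else f"failed: {error}")
```

If the probe fails with `security` set to `tls` but succeeds with any other value (which skips STARTTLS), the security type is the setting to look at first.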