Email Notification question

stecino
Posts: 248
Joined: Thu Mar 14, 2013 4:42 pm

Email Notification question

Post by stecino »

Hi All,

I am trying to separate the notification emails for prod and stage servers/services. To do this I defined separate notification commands, separate contact definitions, and separate host/service templates for each environment.
But for some reason, when I get an alert notification for a prod server/service, I also get a duplicate email for the same alert with a different heading that points to the stage definition. It's very weird. This is what I have:

commands


Prod
------
# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $LONGHOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s " $HOSTNAME$ is $HOSTSTATE$ " $CONTACTEMAIL$
}

# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nComment: $SERVICEACKCOMMENT$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n$LONGSERVICEOUTPUT$" | /bin/mail -s "$HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ " $CONTACTEMAIL$
}


Stage
-------
# 'notify-host-by-email-stage' command definition
define command{
command_name notify-host-by-email-stage
command_line /usr/bin/printf "%b" "***** Nagios Stage Alert*****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $LONGHOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s " $HOSTNAME$ is $HOSTSTATE$ " $CONTACTEMAIL$
}

# 'notify-service-by-email-stage' command definition
define command{
command_name notify-service-by-email-stage
command_line /usr/bin/printf "%b" "***** Nagios Stage Alert*****\n\nNotification Type: $NOTIFICATIONTYPE$\nComment: $SERVICEACKCOMMENT$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n$LONGSERVICEOUTPUT$" | /bin/mail -s "$HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ " $CONTACTEMAIL$
}


Contact definitions
----------------------

prod

define contact{
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email ; send service notifications via email
host_notification_commands notify-host-by-email ; send host notifications via email
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}

stage

define contact{
name stage-generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email-stage ; send service notifications via email
host_notification_commands notify-host-by-email-stage ; send host notifications via email
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}

prod

#######################################################
################# ALERTS SYSTEM #####################
#######################################################

define contact{
use generic-contact
contact_name alerts_system
alias Alerts System
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
email alerts-system@blah.com
}

stage

#######################################################
################# ALERTS SYSTEM STAGING ##############
#######################################################

define contact{
use stage-generic-contact
contact_name stage-alerts_system
alias Stage-Alerts System
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
email alerts-system@blah.com
}

define contactgroup{
contactgroup_name apps
alias Application Server Administrators
members alerts_system
}

define contactgroup{
contactgroup_name stage-apps
alias Stage Application Server Administrators
members stage-alerts_system
}


For the service I have this template and service definition:

define service{
name mniv-services
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 0
check_freshness 0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
contact_groups apps
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
is_volatile 0
max_check_attempts 3
check_period 24x7
normal_check_interval 3
retry_check_interval 1
action_url /graphs/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/graphs/popup?host=$HOSTNAME$&srv=$SERVICEDESC$'
register 0
}


define service {
use mniv-services
hostgroup_name MNIV
service_description LOAD
check_command check_nrpe!check_load
}

Contact group is set for apps.

So when there is an alert on one of the servers in this MNIV group, I get the following two emails, one with the stage header and one without. But since this is a prod server, I shouldn't see the one with Stage in the header.

***** Nagios Stage Alert*****

Notification Type: PROBLEM
Comment:

Service: LOAD
Host: us1map03
Address: xx.xx.4.103
State: WARNING

Date/Time: Mon Aug 4 10:25:15 PDT 2014

Additional Info:

WARNING - load average: 9.58, 10.41, 10.31


***** Nagios*****

Notification Type: PROBLEM
Comment:

Service: LOAD
Host: us1map03
Address: xx.xx.4.103
State: WARNING

Date/Time: Mon Aug 4 10:25:15 PDT 2014

Additional Info:

WARNING - load average: 9.58, 10.41, 10.31
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email Notification question

Post by Box293 »

Can you show us the host definition for us1map03 AND any template definitions it is using.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
stecino
Posts: 248
Joined: Thu Mar 14, 2013 4:42 pm

Re: Email Notification question

Post by stecino »

Here it is

define host{
name mniv-hosts ; The name of this host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 5 ; Actively check the host every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 3 ; Check each Linux host 3 times (max)
check_command check-host-alive ; Default command to check Linux hosts
contact_groups apps
notification_period 24x7 ; Notifications can be sent at any time
notification_interval 60 ; Resend notifications every 60 minutes (with the default interval_length)
notification_options d,u,r ; Only send notifications for specific host states
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

define host{
use mniv-hosts
host_name US1MAP03
alias us1map03
address xx.xx.4.103
}
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email Notification question

Post by Box293 »

Nothing obvious sticks out.

I suggest you turn on debug logging and see what happens.
In your nagios.cfg please set this to -1

debug_level=-1
Restart Nagios.

Watch /usr/local/nagios/var/nagios.debug and see what is logged when these dual emails are sent.
stecino
Posts: 248
Joined: Thu Mar 14, 2013 4:42 pm

Re: Email Notification question

Post by stecino »

Is it possible that this is a bug? I have different notification commands, my contact definitions use separate templates, and the contact groups are different.
Somehow the production contact group is triggering both notification commands, while the stage contact group triggers only one.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email Notification question

Post by Box293 »

Everything is possible!

We need to investigate to find out what is happening, debug logs will help!
stecino
Posts: 248
Joined: Thu Mar 14, 2013 4:42 pm

Re: Email Notification question

Post by stecino »

Here is a snippet from the log file

[1407952514] SERVICE NOTIFICATION: alerts_system;US1NRDB03;DISK SPACE;WARNING;notify-service-by-email;DISK WARNING - free space: / 138843 MB (96% inode=99%): /mongodbdata 216723 MB (20% inode=99%): /journal 96508 MB (95% inode=99%):
[1407952514] SERVICE NOTIFICATION: stage-alerts_system;US1NRDB03;DISK SPACE;WARNING;service_email_stage;DISK WARNING - free space: / 138843 MB (96% inode=99%): /mongodbdata 216723 MB (20% inode=99%): /journal 96508 MB (95% inode=99%):

The first line is fine. The second line should only be logged if it's a stage server, and in this case it's not. As you can see, the two lines reference different contact groups and different commands for some reason, but both are sent for the same host. Would a duplicate service definition cause this?
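
To spot this kind of double notification across a whole log rather than by eye, the two lines above can be grouped by timestamp, host, service and state. The following is a minimal sketch, assuming the log lines follow the `contact;host;service;state;command;output` layout seen in the snippet; the function names are made up for illustration and are not part of Nagios.

```python
import re
from collections import defaultdict

# Matches log lines shaped like the two in the snippet above:
# [epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output
LINE_RE = re.compile(r"^\[(\d+)\] SERVICE NOTIFICATION: (.*)$")


def group_notifications(lines):
    """Group (contact, command) pairs by (timestamp, host, service, state)."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a service notification line
        timestamp = int(match.group(1))
        # Split into at most 6 fields so the plugin output stays in one piece.
        fields = match.group(2).split(";", 5)
        if len(fields) < 5:
            continue  # malformed line, skip it
        contact, host, service, state, command = fields[:5]
        groups[(timestamp, host, service, state)].append((contact, command))
    return dict(groups)


def find_duplicates(groups):
    """Keep only the events that notified more than one contact."""
    return {key: sent for key, sent in groups.items() if len(sent) > 1}


# The two log lines from the post above.
sample_lines = [
    "[1407952514] SERVICE NOTIFICATION: alerts_system;US1NRDB03;DISK SPACE;"
    "WARNING;notify-service-by-email;DISK WARNING - free space: / 138843 MB "
    "(96% inode=99%): /mongodbdata 216723 MB (20% inode=99%): /journal 96508 MB "
    "(95% inode=99%):",
    "[1407952514] SERVICE NOTIFICATION: stage-alerts_system;US1NRDB03;DISK SPACE;"
    "WARNING;service_email_stage;DISK WARNING - free space: / 138843 MB "
    "(96% inode=99%): /mongodbdata 216723 MB (20% inode=99%): /journal 96508 MB "
    "(95% inode=99%):",
]

duplicates = find_duplicates(group_notifications(sample_lines))
for event, sent in duplicates.items():
    print(event, sent)
```

Each reported event lists every contact and command that fired for it, which makes it easier to see which contact is unexpectedly attached to a prod host.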
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email Notification question

Post by Box293 »

Can you post the whole log from when the service triggers a warning state and all the commands that take place up to and including the double notification.
stecino
Posts: 248
Joined: Thu Mar 14, 2013 4:42 pm

Re: Email Notification question

Post by stecino »

I made a change that fixed the issue. This could even be a bug in Nagios; I still have no clue why it happened.

I had contact groups apps and stage-apps. Somehow stage-apps was being used as well, although none of the production definitions inherit from a template that has stage-apps as its contact group.
What I did was rename stage-apps to stageApps, so that the two names differ more clearly. This worked.

But again, it's still very weird. I have never seen this problem before, although until now I only had to work with one set of host and service email notification commands; in this case I have two sets, stage and prod. They are separated and should have worked without any headaches.
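
One way to check where a contact group is actually referenced is to scan the object config text for every definition that names it. The following is a rough sketch, not a real Nagios parser: it assumes simple `define type{ ... }` blocks with no `}` inside a directive value, the helper names are invented, and the second template in the sample is hypothetical and does not come from this thread. It compares whole group names, so `apps` does not match `stage-apps`.

```python
import re

# Very loose matcher for "define <type>{ ... }" blocks (assumption: no "}"
# appears inside a directive value).
BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)

# Directives that can attach a contact group to an object.
REF_DIRECTIVES = ("contact_groups", "contactgroups", "contactgroup_members")

# Directives tried, in order, to give each definition a readable label.
LABEL_KEYS = ("name", "contact_name", "contactgroup_name", "host_name",
              "service_description")


def parse_blocks(text):
    """Return a list of (object_type, {directive: value}) tuples."""
    blocks = []
    for obj_type, body in BLOCK_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        blocks.append((obj_type, directives))
    return blocks


def find_references(text, group):
    """List (object_type, label, directive) for each definition naming group."""
    hits = []
    for obj_type, directives in parse_blocks(text):
        label = next((directives[k] for k in LABEL_KEYS if k in directives), "?")
        for key in REF_DIRECTIVES:
            # A leading "+" marks additive inheritance; strip it before comparing.
            names = [n.strip().lstrip("+") for n in directives.get(key, "").split(",")]
            if group in names:
                hits.append((obj_type, label, key))
    return hits


sample_config = """
define service{
name mniv-services
contact_groups apps
register 0
}

define service{
name hypothetical-stage-services ; made-up template, for illustration only
contact_groups +stage-apps
register 0
}
"""

stage_refs = find_references(sample_config, "stage-apps")
prod_refs = find_references(sample_config, "apps")
print(stage_refs)
print(prod_refs)
```

Running something like this over all the config files before and after the rename would show whether any prod host, service, contact or contact group definition still mentions the stage group.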
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Email Notification question

Post by slansing »

Honestly I've not heard of this happening before, but we will keep an eye out for similar issues in the future. What version of Core was this on?