Not received problem alert

Posted: Fri Feb 14, 2014 6:31 am
by plakshmi
Team,

For one particular service, we did not receive the problem alert but did receive the recovery alert in Nagios Core. We are not sure what the reason was.
We checked the log file, and it shows that both the problem alerts and the recovery alerts were sent.
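
For reference, a check like this can be sketched with a small shell helper. It is a rough sketch only: the function name service_notifications and the log path in the usage line are assumptions, so adjust them to the local install. It relies on the standard log format, where the fields after "SERVICE NOTIFICATION:" are contact;host;service;state;command;plugin output.

```shell
# service_notifications: list the notification entries logged for one service.
# Arguments: log file, host name, service description.
# (Hypothetical helper name; not part of Nagios.)
service_notifications() {
    # Splitting on ';' makes field 2 the host and field 3 the service;
    # field 4 is the state (for example CRITICAL or OK) and field 5 is
    # the notification command that was run.
    grep 'SERVICE NOTIFICATION:' "$1" |
        awk -F';' -v host="$2" -v svc="$3" '$2 == host && $3 == svc'
}
```

Example call, assuming the log is at /var/log/nagios3/nagios.log:

service_notifications /var/log/nagios3/nagios.log msp "MSP Login Page"

Each matching line names the contact and the notification command, which helps separate "Nagios did not send" from "the mail or pager command did not deliver".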

Please let me know if you need any other details.

Re: Not received problem alert

Posted: Fri Feb 14, 2014 10:48 am
by tmcdonald
Can you post the config file for that service?

Also, moving your thread to General Support < Nagios Core since it doesn't involve development.

Re: Not received problem alert

Posted: Fri Feb 14, 2014 11:41 am
by plakshmi
Here is the config for that service.

define service {
use common-notify-service
host_name msp
service_description MSP Login Page
contact_groups UsaDcsEhAlerts,DMSSupportTeam,eBizSupportTeam
check_command common_check_https1!www.accounts.xerox.com!"Login"!/auth/login.jsf?app=MSP
}


define contactgroup {
contactgroup_name eBizSupportTeam
alias eBiz Support Team
members ebizsupport
}

define contactgroup {
contactgroup_name DMSSupportTeam
alias DMS Support Team
members jrudy,USADMSSupport
}



define contactgroup {
contactgroup_name UsaDcsEhAlerts
alias Usa Dcs Eh Alerts
members UsaDcsEhAlerts,msarney,mchenna
}



define contact {
contact_name UsaDcsEhAlerts
alias Usa Dcs Eh Alerts
email [email protected]
service_notification_period 24x7
host_notification_period 24x7
service_notification_options c,r
host_notification_options d,r
service_notification_commands notify-by-email,notify-by-epager
host_notification_commands host-notify-by-email,notify-by-epager
}


define timeperiod {
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}



define service {
name common-notify-service
use base-service
register 0
notifications_enabled 1
notification_interval 5
notification_period gepc-prod-hours
notification_options c,r
}


define service {
name base-service
register 0
is_volatile 0
max_check_attempts 10
normal_check_interval 5
retry_check_interval 1
active_checks_enabled 1
passive_checks_enabled 1
check_period 24x7
parallelize_check 1
obsess_over_service 1
check_freshness 0
event_handler_enabled 1
flap_detection_enabled 0
process_perf_data 0
retain_status_information 1
retain_nonstatus_information 1
}

Re: Not received problem alert

Posted: Fri Feb 14, 2014 11:55 am
by tmcdonald
In the following section

define service {
name common-notify-service
use base-service
register 0
notifications_enabled 1
notification_interval 5
notification_period gepc-prod-hours
notification_options c,r
}

you have the line

notification_options c,r

which will only notify/alert on Critical and Recovery, but not Warning. You will need to add 'w' to alert on Warning. What was the service state that was not sending alerts?

Re: Not received problem alert

Posted: Fri Feb 14, 2014 11:59 am
by slansing
But which service is causing the problems? Also, are you sure that your notification commands are set correctly on the contact you listed?

You have:

notify-by-email
In the service notification commands section, should it not be service-notify-by-email? If you modified these, let us know.

Re: Not received problem alert

Posted: Sun Feb 16, 2014 7:15 am
by plakshmi
Following is the service causing the problem (MSP Login Page).

define service {
use common-notify-service
host_name msp
service_description MSP Login Page
contact_groups UsaDcsEhAlerts,DMSSupportTeam,eBizSupportTeam
check_command common_check_https1!www.accounts.xerox.com!"Login"!/auth/login.jsf?app=MSP
}

It should be notify-by-email only. We have not modified anything in the service file.

Re: Not received problem alert

Posted: Mon Feb 17, 2014 11:44 am
by plakshmi
The original problem is that we are not receiving critical alerts, only recovery ones. If the service notification commands were incorrect, we would not receive recovery alerts either. The time period used is also 24x7 for both the check and notification periods. We are not sure how to proceed. Please let us know if you want any further information.

Re: Not received problem alert

Posted: Mon Feb 17, 2014 12:29 pm
by slansing
What is the output of the following:

cat /usr/local/nagios/etc/commands.cfg | grep notify
Can you also share the configuration for the contact, and also the template assigned to that service?

Re: Not received problem alert

Posted: Tue Feb 18, 2014 11:35 am
by plakshmi
Here is the output of cat commands.cfg|grep notify

ip-10-244-159-111:/etc/nagios3# cat commands.cfg|grep notify
# 'notify-host-by-email' command definition
command_name notify-host-by-email
# 'notify-service-by-email' command definition
command_name notify-service-by-email
# 'notify-by-email' command definition
command_name notify-by-email
# 'notify-by-epager' command definition
command_name notify-by-epager
# 'host-notify-by-email' command definition
command_name host-notify-by-email
# 'host-notify-by-epager' command definition
command_name host-notify-by-epager
ip-10-244-159-111:/etc/nagios3#


Following is the configuration of the service and contact.

define service {
use common-notify-service
host_name msp
service_description MSP Login Page
contact_groups UsaDcsEhAlerts,DMSSupportTeam,eBizSupportTeam
check_command common_check_https1!www.accounts.xerox.com!"Login"!/auth/login.jsf?app=MSP
}



define contactgroup {
contactgroup_name UsaDcsEhAlerts
alias Usa Dcs Eh Alerts
members UsaDcsEhAlerts,msarney,mchenna
}



define contact {
contact_name UsaDcsEhAlerts
alias Usa Dcs Eh Alerts
email [email protected]
service_notification_period 24x7
host_notification_period 24x7
service_notification_options c,r
host_notification_options d,r
service_notification_commands notify-by-email,notify-by-epager
host_notification_commands host-notify-by-email,notify-by-epager
}

Re: Not received problem alert

Posted: Tue Feb 18, 2014 4:10 pm
by slansing
Plakshmi, we still need to see the template (common-notify-service).
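
The effective settings for the service, after template inheritance is resolved, can also be read from the object cache file that Nagios writes when it starts. Below is a minimal sketch. The helper name show_service is hypothetical, and the cache path in the usage line is an assumption for a nagios3 package install; the object_cache_file setting in nagios.cfg gives the real location.

```shell
# show_service: print the fully expanded definition of one service from an
# object cache file. Arguments: cache file, host name, service description.
# (Hypothetical helper name; not part of Nagios.)
show_service() {
    awk -v host="$2" -v svc="$3" '
        # Start collecting at each "define service {" line.
        /^define service[ \t]*\{/ { block = ""; inblock = 1 }
        inblock {
            block = block $0 "\n"
            # Strip leading whitespace, then pick out the two key directives.
            line = $0
            sub(/^[ \t]+/, "", line)
            if (line ~ /^host_name[ \t]/)           { sub(/^host_name[ \t]+/, "", line); h = line }
            if (line ~ /^service_description[ \t]/) { sub(/^service_description[ \t]+/, "", line); s = line }
        }
        # At the closing brace, print the block only if both values match.
        inblock && /^[ \t]*\}/ {
            if (h == host && s == svc) printf "%s", block
            inblock = 0; h = ""; s = ""
        }
    ' "$1"
}
```

Example call, assuming the cache is at /var/cache/nagios3/objects.cache:

show_service /var/cache/nagios3/objects.cache msp "MSP Login Page" | grep notification

This shows the notification_period, notification_options and notification_interval that actually apply to the service, whichever template they came from.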