notification problem

Posted: Wed Jul 03, 2013 7:37 am
by lafargeuser
Pasting Nagios alerts for your reference.

Everything else works fine, except that I do not receive the recovery notification for one host, once a week on Saturday.

That is, a job is scheduled on the app server every Saturday, and during that period CPU load rises above the critical threshold and then returns to OK. I receive the CRITICAL alert for CPU, and I should get a recovery alert when CPU leaves the critical state. I can see the OK alert in the Alerts tab, but I do not receive a service notification for the OK state, and I am not sure why.

Below are my notification options:
c,w,r


[06-29-2013 05:38:28] SERVICE ALERT: BGLRMSRMDB;CPU Load;OK;HARD;3;CPU Load 56% (5 min average)
[06-29-2013 05:23:28] SERVICE FLAPPING ALERT: BGLRMSRMDB;CPU Load;STARTED; Service appears to have started flapping (21.3% change >= 20.0% threshold)
[06-29-2013 05:23:28] SERVICE ALERT: BGLRMSRMDB;CPU Load;WARNING;HARD;3;CPU Load 82% (5 min average)
[06-29-2013 05:18:28] SERVICE ALERT: BGLRMSRMDB;CPU Load;CRITICAL;HARD;3;CPU Load 92% (5 min average)
[06-29-2013 05:16:28] SERVICE ALERT: BGLRMSRMDB;CPU Load;WARNING;SOFT;2;CPU Load 85% (5 min average)
[06-29-2013 05:14:28] SERVICE ALERT: BGLRMSRMDB;CPU Load;WARNING;SOFT;1;CPU Load 80% (5 min average)

Re: notification problem

Posted: Wed Jul 03, 2013 11:58 am
by sreinhardt
Are the notification options for your contact also set to recovery?

Re: notification problem

Posted: Thu Jul 04, 2013 12:59 am
by lafargeuser
Yes.

define contact{
contact_name remedy
use generic-contact
alias bmc
email [email protected]
service_notifications_enabled 1
host_notifications_enabled 1
service_notification_period 24x7
host_notification_period 24x7
service_notification_options c,w,r
host_notification_options d,u,r,n
}

Re: notification problem

Posted: Mon Jul 08, 2013 11:17 am
by abrist
It looks like this service was flapping. Notifications are suppressed while a service is flapping, in order to reduce the flood of alerts that a flapping check can cause. If you want to be notified when it starts and stops flapping, add "f" to your notification options.

f = send notifications when the host starts and stops flapping,
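
As a rough sketch of that suggestion, the contact posted earlier in the thread could gain the "f" option as shown below. Only the changed directive is shown; everything else in the contact is assumed to stay as posted. The service's own notification_options would presumably need "f" as well. Note that this only adds flapping start/stop notifications; it does not by itself bring back the suppressed recovery email.

```
define contact{
    contact_name                    remedy
    use                             generic-contact
    # ... other directives unchanged from the contact posted above ...
    # "f" added: also notify when the service starts and stops flapping
    service_notification_options    c,w,r,f
    }
```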

Re: notification problem

Posted: Tue Jul 16, 2013 7:37 am
by lafargeuser
So this is why I don't get the recovery notification.

Basically, I have an email integration between Nagios and Remedy (a service desk tool). When a critical alert arrives, a ticket is raised in Remedy, and the recovery alert auto-closes the ticket.
Because of the service flapping, I do not receive the recovery email notification, so the ticket is not auto-closed.

Is there any solution for this?

Re: notification problem

Posted: Tue Jul 16, 2013 10:23 am
by abrist

Re: notification problem

Posted: Wed Jul 17, 2013 1:22 am
by lafargeuser
Shall I disable flap detection for this host?

Re: notification problem

Posted: Wed Jul 17, 2013 10:33 am
by sreinhardt
You could, but this could lead to false positives and a lot of tickets. By adjusting the flapping levels instead, you can effectively still tell when it is actually having issues and when it may be flapping.
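
A minimal sketch of what adjusting the flapping levels might look like for this service, using the per-service low_flap_threshold and high_flap_threshold directives. The values 25.0 and 40.0 are illustrative assumptions, not recommendations; the log above shows the service being flagged at a 21.3% state change against a 20.0% threshold, so any high threshold above that figure would not have been crossed on that occasion.

```
define service{
    host_name               BGLRMSRMDB
    service_description     CPU Load
    # ... existing directives (template, check_command, etc.) stay as they are ...
    flap_detection_enabled  1
    # Illustrative values only: flapping starts above the high threshold
    # and stops once the state change drops below the low threshold.
    low_flap_threshold      25.0
    high_flap_threshold     40.0
    }
```

Setting flap_detection_enabled to 0 instead would turn flap detection off for this service entirely, with the trade-off described above.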

Re: notification problem

Posted: Thu Jul 18, 2013 1:22 am
by lafargeuser
"By adjusting the flapping levels instead..."

Actually, a script runs once a week, every Saturday, and at that time CPU spikes above the threshold value.
I do get the CPU CRITICAL alerts. But when CPU returns to a normal state, I should get an OK alert, which I am not getting. Because of that, the ticket is not resolved.

Also, I increased the threshold values for the warning and critical alerts.

Re: notification problem

Posted: Thu Jul 18, 2013 11:30 am
by abrist
Is your contact set to be notified when a service enters a RECOVERY state?