Host/Service escalation trick...
Posted: Wed Jan 30, 2013 12:42 am
I don't know if anyone has done this - it's a hard topic to search for, and I wasn't able to find anyone who addressed this issue - so I thought I'd post it here.
Essentially the problem I had was:
Normally I'd generate three alerts, say one every 5 minutes [this was defined as service or host escalation #1] - then no more alerts from this escalation.
Then I wanted one alert, say every hour - forever. [This is escalation #2]
But I also wanted no alerts at all during some hours, say from midnight to 6a.
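For reference, the two escalations described above might look roughly like this in Nagios object configuration. This is only a sketch: the host, service and contact group names (web01, HTTP, admins) are made up, and the quiet-hours filter is left out because the post doesn't say which object it is attached to.

```
# Sketch only - web01, HTTP and admins are hypothetical names.

# Escalation #1: notifications 1 through 3, five minutes apart.
define serviceescalation {
    host_name              web01
    service_description    HTTP
    contact_groups         admins
    first_notification     1
    last_notification      3
    notification_interval  5
}

# Escalation #2: from the 4th notification on, one per hour.
# last_notification 0 means this escalation never expires.
define serviceescalation {
    host_name              web01
    service_description    HTTP
    contact_groups         admins
    first_notification     4
    last_notification      0
    notification_interval  60
}
```

The intervals assume the default interval_length of 60 seconds, so 5 means five minutes and 60 means one hour.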
BUT...
When the new "day" started at 6a, I wanted alerts immediately for any services that were down - but with the above, I'd get an alert within one hour, but not right at the start of the day.
Example: Say a service went down at 2:59a. Normally I'd get the three initial alerts (from escalation #1), but I'm filtering with a time period, so I won't get those initial three alerts, or the hourly ones (from escalation #2), until the next 60-minute mark after 6a rolls around. [i.e. the hourly alerts would fall at 3:59, 4:59, 5:59 and 6:59. I won't get the 3:59a-5:59a alerts, and I will get the 6:59 alert - but I want to know the service was down at 6a, not nearly 7a!]
So, here's my solution.
Define another time period. In my case it runs from 6:00 to 6:15a, on each day I want to get alerts.
[I actually have several, depending on when I want the period to start, say at 4, 6a or 8a]
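A time period like that could be sketched as follows. The timeperiod_name and alias are invented for illustration, and the list of days is an assumption (every day of the week); drop any day on which you don't want the start-of-day alerts.

```
# Sketch only - the name "day-start-0600" is hypothetical.
define timeperiod {
    timeperiod_name  day-start-0600
    alias            First 15 minutes of the alerting day
    sunday           06:00-06:15
    monday           06:00-06:15
    tuesday          06:00-06:15
    wednesday        06:00-06:15
    thursday         06:00-06:15
    friday           06:00-06:15
    saturday         06:00-06:15
}
```

For a period that starts at 4a or 8a, a second timeperiod with 04:00-04:15 or 08:00-08:15 ranges would follow the same pattern.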
Then define a third host/service escalation.
This one has "last_notification 0", so it never expires.
I have the notification_interval set to 5 minutes.
So this escalation will generate alerts every 5 minutes, forever.
I set the escalation_period to that 6:00-6:15a time period.
So the escalation_period will only let this escalation's notifications go out for 15 minutes each day - and I've set that time period to be the first 15 minutes of the larger period.
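Put together, the third escalation might look something like this. Again a sketch under assumptions: web01, HTTP and admins are made-up names, day-start-0600 stands for whatever the 6:00-6:15a time period is called, and first_notification 1 is a guess, since the post only gives last_notification, the interval and the escalation_period.

```
# Sketch only - all object names here are hypothetical.
# Escalation #3: valid only during the short start-of-day window.
define serviceescalation {
    host_name              web01
    service_description    HTTP
    contact_groups         admins
    first_notification     1               ; assumption, not stated in the post
    last_notification      0               ; 0 = keep using this escalation forever
    notification_interval  5               ; every 5 minutes (interval_length 60)
    escalation_period      day-start-0600  ; the 6:00-6:15a time period
}
```

When this escalation overlaps with the hourly one inside the window, Nagios should pick the smallest notification_interval among the valid escalations, which is why the 5-minute interval applies there. The same shape works for hosts with define hostescalation, minus the service_description line.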
The result is: I get three alerts (roughly at 6:00, 6:05 and 6:10 in my case) for any down service right at the beginning of each larger period, and then they "stop" [or, more accurately, are squelched] until tomorrow - if the service is still down.
Hope that helps someone else - and perhaps this could go in a FAQ somewhere. [Provided this is novel - but I certainly wasn't able to find it with my Google-fu, or any other searching I did.]
-Greg
Keywords: notifications, service escalation, host escalation, initial notifications at the beginning of a period, escalate notification