Here is an example of a PING check on one of our servers:
define service {
host_name yuma-065wm
service_description PING
is_volatile 0
check_command check_ping!200,20%!500,60%
max_check_attempts 5
normal_check_interval 5
retry_check_interval 5
passive_checks_enabled 1
check_period 24x7
parallelize_check 1
obsess_over_service 1
check_freshness 1
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 0
retain_status_information 1
retain_nonstatus_information 1
contact_groups novell-admins-Email
active_checks_enabled 0
notification_interval 120
notification_period 24x7
notification_options w,c,u,r,f
notifications_enabled 1
register 1
}
If a check changes from an OK state to a (soft) non-OK state, the checks will occur every 5 minutes (retry_check_interval 5) 5 more times (max_check_attempts 5).
If the service is still in a non-OK state after that (25 min later), an alert will be sent out. Because the state has changed again, the service will again be checked every 5 minutes (retry_check_interval 5). If it stays in that Hard non-OK state, checking will revert to the normal_check_interval, which is also every 5 minutes, with an alert every 5 minutes.
Is that right?
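To show where my 25 minutes comes from, here is a small Python sketch of the arithmetic. It is only my mental model, not Nagios code: the function name and the `first_failure_counts` flag are made up for illustration. The flag covers the one thing I'm unsure about, namely whether the first non-OK result already counts as attempt 1 of max_check_attempts.

```python
def soft_to_hard_timeline(max_check_attempts, retry_interval, first_failure_counts=True):
    """Return the minutes (after the first non-OK result) at which each check runs.

    Hypothetical helper for illustration only. The last entry is the moment
    the state would become Hard under the chosen counting assumption.
    """
    # If the first failure counts as attempt 1, only (max - 1) retries remain.
    retries = max_check_attempts - 1 if first_failure_counts else max_check_attempts
    return [i * retry_interval for i in range(retries + 1)]


# Values taken from the service definition above:
# max_check_attempts 5, retry_check_interval 5
print(soft_to_hard_timeline(5, 5))                              # first failure counted
print(soft_to_hard_timeline(5, 5, first_failure_counts=False))  # "5 more times"
```

With the first failure counted, the last attempt lands at 20 minutes; with my "5 more times" reading it lands at 25 minutes.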
Then, according to my escalation rule
define serviceescalation {
hostgroup_name Branch_Servers
service_description PING,LVM,Load,NSS,NTPD Syncing,RAID Status,San Storage
contact_groups admins-SMS
first_notification 2
last_notification 6
notification_interval 60
escalation_period Branch_Servers
escalation_options u,c,r
}
I will get an escalation alert on the second alert (which would be 10 minutes after the state change), then every 25 minutes after that, but only on the second and third alerts: at 25 min and 50 min.
So I'll get a level-one alert at 25 min and then every 5 min until recovery,
AND an escalation alert at 25 min and no others, because the escalation's notification_interval is 60 and alerts 4, 5, and 6 all come before that 60-minute interval has passed.
If that is incorrect, what does "notification_interval 60" mean? And what should it realistically be set to?
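The other reading I can think of is that notification_interval is simply the gap between one notification and the next, taken from the escalation while the notification number is inside first_notification..last_notification and from the service otherwise. Below is a rough Python sketch of that reading. Everything in it is my assumption rather than confirmed Nagios behaviour: the function name is hypothetical, times are measured from the first notification, and I am guessing that escalated notifications go to the escalation's contact group.

```python
def notification_times(service_interval, esc_first, esc_last, esc_interval, count):
    """Return (number, minutes_after_first_notification, escalated) tuples.

    Hypothetical model for illustration only, not Nagios code.
    """
    timeline = []
    t = 0
    for n in range(1, count + 1):
        escalated = esc_first <= n <= esc_last
        timeline.append((n, t, escalated))
        # Assumption: the gap to the next notification uses the escalation's
        # interval while notification n is inside the escalation range,
        # and the service's own interval otherwise.
        t += esc_interval if escalated else service_interval
    return timeline


# Values from the two definitions above: service notification_interval 120,
# escalation first_notification 2, last_notification 6, notification_interval 60.
for n, t, escalated in notification_times(120, 2, 6, 60, 7):
    group = "admins-SMS" if escalated else "novell-admins-Email"
    print(f"notification {n}: +{t} min -> {group}")
```

Under this reading, notification 2 would not go out until 120 minutes after the first one, and notifications 3 to 6 would follow at 60-minute steps.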
If I have any of this wrong, please correct me as necessary. I'm trying to wrap my head around this.
Thanks