Host or Service Check Interval (SOLVED)
Posted: Tue Aug 16, 2016 10:43 am
I'm trying to figure out how I can check a service and/or a host every 20 seconds, retry every 10 seconds, and send a notification only after 3 retries.
This is what I have so far:
Template used:
define host{
name host-services ; The name of this host template
check_period extendhours ; Time period during which active checks are run
check_interval 0.30 ; Regular check interval, in time units (0.30 x 60 s = 18 s if interval_length is the default 60)
retry_interval 0.20 ; Retry interval, in time units (0.20 x 60 s = 12 s if interval_length is the default 60)
max_check_attempts 3 ; Check each host up to 3 times before the state becomes HARD
check_command check-host-alive ; Default command to check if hosts are "alive"
notification_interval 0 ; 0 = send one problem notification only, no re-notification
notification_options d,r,u ; Notify on DOWN, recovery and UNREACHABLE states
contact_groups admins ; Notifications get sent to the admins by default
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
notification_period extendhours ; Time period during which notifications may be sent
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
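For reference, Nagios expresses check_interval and retry_interval in "time units" whose length is set by interval_length in the main config file (60 seconds by default). The following is a minimal Python sketch of that arithmetic. It assumes interval_length is still at its default, since nagios.cfg is not shown in this post, and the helper names are hypothetical.

```python
# Assumption: interval_length in nagios.cfg is left at its default of 60 seconds.
INTERVAL_LENGTH = 60


def interval_to_seconds(units, interval_length=INTERVAL_LENGTH):
    """Convert a check_interval / retry_interval value (time units) to seconds."""
    return round(units * interval_length, 2)


def seconds_to_interval(seconds, interval_length=INTERVAL_LENGTH):
    """Convert a desired number of seconds to time units, rounded to 2 decimals."""
    return round(seconds / interval_length, 2)


# Values taken from the template above.
configured = {"check_interval": 0.30, "retry_interval": 0.20}
for name, units in configured.items():
    print(f"{name} {units} -> {interval_to_seconds(units)} s")

# Targets stated in the question: check every 20 s, retry every 10 s.
for target in (20, 10):
    print(f"{target} s -> {seconds_to_interval(target)} time units")
```

Under that assumption the template schedules checks every 18 seconds and retries every 12 seconds, while the stated targets of 20 and 10 seconds would correspond to roughly 0.33 and 0.17 time units. Note that these intervals only control scheduling; the time a check itself takes to run is not included.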
Host config:
define host{
use host-services ; Inherit default values from a template
host_name laptop ; The name we're giving to this host
alias Laptop ; A longer name associated with the host
address 10.2.10.166 ; IP address of the host
active_checks_enabled 1
}
And this is what I get in the logs (newest entry first). From the moment I unplug the laptop, it takes about 40-50 seconds for the first SOFT 1 alert to show up. The second retry then appears 36 seconds later, and the third retry another 36 seconds after that. Why does it take 40-50 seconds to show up as SOFT down, and then 36 seconds for every retry?
[08-16-2016 11:30:22] HOST ALERT: laptop;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[08-16-2016 11:29:46] HOST ALERT: laptop;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[08-16-2016 11:29:10] HOST ALERT: laptop;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
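The gaps between these alerts can be checked directly from the timestamps. This is a small Python sketch that parses the three log lines quoted above and prints the time between consecutive alerts; the function and variable names are illustrative only.

```python
from datetime import datetime

# The three log lines from the post, in the order they were quoted (newest first).
log_lines = [
    "[08-16-2016 11:30:22] HOST ALERT: laptop;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%",
    "[08-16-2016 11:29:46] HOST ALERT: laptop;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%",
    "[08-16-2016 11:29:10] HOST ALERT: laptop;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%",
]


def parse_timestamp(line):
    """Extract the bracketed timestamp at the start of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S")


# Sort oldest first, then take the difference between neighbouring entries.
times = sorted(parse_timestamp(line) for line in log_lines)
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
print(gaps)  # [36.0, 36.0]
```

Both gaps come out at 36 seconds, which matches the figures described above and is the number to compare against the configured retry_interval.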
Thank you very much for your support.