I recently set up a new Nagios system (version 4.0.8) that uses passive service checks only. This works as expected and passive results are being received, but I am having a problem with stale results.
I would like to configure the system so that when a check goes stale, a command runs that puts the service into a WARNING state with the message "no recent passive updates received". I have described the issue in more detail below and included the relevant configuration:
a) After three minutes (for this particular check) the results turn stale due to receiving no passive update:
[1442935266] Warning: The results of service 'System-Memory' on host 'testunit' are stale by 0d 0h 0m 34s (threshold=0d 0h 3m 20s). I'm forcing an immediate check of the service.

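For context, the service named in that log line can be pictured as something like the sketch below. This is an assumption for illustration, not a copy of my actual definition: the host and service names come from the log message, the `use passive-service` line assumes the service inherits the template shown further down, and the 200-second threshold is chosen only because it matches the 3m 20s reported in the log.

```
# Hypothetical concrete service (illustrative sketch, not my real config)
define service {
    use                  passive-service   ; assumed: inherits the template in this post
    host_name            testunit          ; taken from the log line
    service_description  System-Memory     ; taken from the log line
    freshness_threshold  200               ; assumed: 3m 20s, overrides the template's 600
}
```

With a definition like this, the check_command would be inherited from the template, so it is the command I expect to run when the result goes stale.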
# no_recent_passive definition
define command {
    command_name    no_recent_passive
    command_line    /usr/local/nagios/libexec/check_dummy 1 "There have been no recent passive updates"
}
Service definition:
define service {
    name                          passive-service    ; The 'name' of this service template
    active_checks_enabled         1                  ; Active service checks are enabled
    passive_checks_enabled        1                  ; Passive service checks are enabled/accepted
    parallelize_check             1                  ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service           1                  ; We should obsess over this service (if necessary)
    check_freshness               1                  ; Check service 'freshness'
    freshness_threshold           600                ; Complain if the data received is more than 10 minutes old
    check_command                 no_recent_passive  ; Report staleness
    notifications_enabled         1                  ; Service notifications are enabled
    event_handler_enabled         1                  ; Service event handler is enabled
    flap_detection_enabled        0                  ; Flap detection is disabled
    process_perf_data             1                  ; Process performance data
    retain_status_information     1                  ; Retain status information across program restarts
    retain_nonstatus_information  1                  ; Retain non-status information across program restarts
    notification_interval         0                  ; Only send notifications on status change by default
    ; is_volatile                 0
    check_period                  24x7
    check_interval                0
    retry_check_interval          1
    max_check_attempts            1
    notification_period           24x7
    notification_options          w,u,c,r
    contact_groups                admins
    register                      0                  ; Don't register this template
}
nagios.cfg:
check_service_freshness=1
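As far as I understand, freshness checking is governed by a few directives in the main configuration file. The fragment below is a sketch: only check_service_freshness=1 is taken from my setup, while the other two values are what I believe the sample nagios.cfg ships with, so treat them as assumptions.

```
# nagios.cfg (main configuration file)

# Enable freshness checking for services (this is the value I have set).
check_service_freshness=1

# Assumed stock value: how often, in seconds, Nagios looks for stale service results.
service_freshness_check_interval=60

# Assumed stock value: seconds added to thresholds that Nagios calculates
# automatically, i.e. when a service sets no explicit freshness_threshold.
additional_freshness_latency=15
```

Since the template sets an explicit freshness_threshold, I would not expect additional_freshness_latency to change when the result is considered stale.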
Is anyone able to point me in the right direction as to why the "check_command" is not called when a result turns stale?
Many thanks,
Sam