Regardless of which freshness_threshold I pick (as long as it's not too unrealistic),
I just want to know whether a bug exists. (By the way, where do you see that the default
freshness threshold is 300 seconds?) Anyway, I just increased the threshold to 180
seconds, and the only thing in my nagios.log was:
[1115831032] Finished daemonizing... (New PID=16154)
[1115831272] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale
by 60 seconds (threshold=180 seconds). I'm forcing an immediate check of the service.
So it did not execute my eventhandler even once? I'm getting very inconsistent results!
NRPE and check_by_ssh are not acceptable methods for distributed monitoring in our
environment.
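
One way to sanity-check the timing in the log excerpts in this thread is to compare the epoch timestamps directly. The sketch below is only an illustration and not part of Nagios: the helper names `parse_entries` and `elapsed_since_first` are made up for this example, the sample entries are copied from the two excerpts with the mail-wrapped lines joined, and it assumes every log entry starts with `[epoch-seconds]`.

```python
import re

# Matches lines shaped like "[1115831032] message text".
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_entries(lines):
    """Return (epoch_seconds, message) pairs; lines without a timestamp are skipped."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def elapsed_since_first(entries):
    """Return (seconds_since_first_entry, message) pairs."""
    if not entries:
        return []
    start = entries[0][0]
    return [(ts - start, message) for ts, message in entries]


# Entries from the run with freshness_threshold=180 (wrapped lines joined).
LOG_180 = [
    "[1115831032] Finished daemonizing... (New PID=16154)",
    "[1115831272] Warning: The results of service 'PROCS-NAGIOS' on host "
    "'csstest2' are stale by 60 seconds (threshold=180 seconds). "
    "I'm forcing an immediate check of the service.",
]

# Entries from the earlier run with freshness_threshold=60 (wrapped lines joined).
LOG_60 = [
    "[1115822708] Finished daemonizing... (New PID=15941)",
    "[1115822828] Warning: The results of service 'PROCS-NAGIOS' on host "
    "'csstest2' are stale by 60 seconds (threshold=60 seconds). "
    "I'm forcing an immediate check of the service.",
    "[1115822838] SERVICE ALERT: csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;CRITICAL",
    "[1115822838] SERVICE EVENT HANDLER: "
    "csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;slave-failover",
    "[1115822948] Warning: The results of service 'PROCS-NAGIOS' on host "
    "'csstest2' are stale by 60 seconds (threshold=60 seconds). "
    "I'm forcing an immediate check of the service.",
]

for label, log in (("threshold=180", LOG_180), ("threshold=60", LOG_60)):
    print(label)
    for seconds, message in elapsed_since_first(parse_entries(log)):
        # Truncate the message so each entry fits on one line.
        print(f"  +{seconds:>4}s  {message[:60]}")
```

In both runs the first stale warning is logged at the threshold plus the reported staleness after daemonizing: 180 + 60 = 240 seconds in the first excerpt and 60 + 60 = 120 seconds in the second.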
Thanks for the comments, Justin.
_________________________
Bryan Loniewski
Rutgers University
NBCS - Systems Programmer
On Wed, 11 May 2005, [email protected] wrote:
> Bryan, A freshness_threshold of 60 seconds might be a little
> unrealistic. The default value for the threshold is 300 seconds (5 minutes).
> If you want almost real-time stats, which appears to be what you're
> going for, perhaps you want to try NRPE or check_by_ssh as an alternative
> method of doing distributed monitoring.
>
> - Justin Kulikowski
> [ http://www.jpk236.com ]
>
> Bryan Loniewski wrote:
>> While trying to set up failover in a distributed environment, I came across
>> the following problem (bug?) involving freshness checking.
>>
>> Note: The host that this is set up on is NOT receiving any passive checks
>> while I am testing the freshness checking, so the results are always stale,
>> forcing the freshness check every time.
>>
>> Note2: Relevant config snippets are under my .sig
>>
>> Configuring (passive) service freshness checking to execute an eventhandler
>> works correctly for 1 or 2 iterations, BUT no more than that. It seems to
>> stop checking the freshness after at most 3 iterations and stops executing the
>> eventhandler after at most 2 iterations. I've replicated this behavior
>> (too) many times and the results are inconsistent.
>>
>> Below is the output of my nagios log:
>>
>>
>> [1115822708] Finished daemonizing... (New PID=15941)
>> [1115822828] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale
>> by 60 seconds (threshold=60 seconds). I'm forcing an immediate check of the service.
>> [1115822838] SERVICE ALERT: csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;CRITICAL
>> [1115822838] SERVICE EVENT HANDLER: csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;slave-failover
>> [1115822948] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale
>> by 60 seconds (threshold=60 seconds). I'm forcing an immediate check of the service.
>>
>>
>> Notice the freshness check ran ONLY 2 times when it should have run 5 (if
>> you look at my config options below) and the eventhandler ran ONLY 1 time,
>> when it should have run 3 times.
>>
>> Can anyone verify (or disprove) this behavior? Am I missing something?
>>
>> _________________________
>> Bryan Loniewski
>> Rutgers University
>> NBCS - Systems Programmer
>>
>>
>> check_service_freshness=1
>> service_freshness_check_interval=60
>>
>>
>>
>> define service{
>> name generic-service
>> parallelize_check 1
>> obsess_over_service 1
>> check_freshness 0
>> freshness_threshold 60
>> notifications_enabled 1
>> event_handler_enabled 1
>> flap_detection_enabled 1
>> failure_prediction_enabled 1
>> process_perf_data 1
>> retain_status_information 1
>> retain_nonstatus_information 1
>> is_volatile 0
>> max_check_attempts 5
>> normal_check_interval 2
>> retry_check_interval 1
>> check_period 24x7
>>
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]