Staleness not reported

Posted: Wed Feb 16, 2011 5:32 am
by pelew
Hi all,
I'm monitoring a number of sites with Nagios and NSCA. The reports all work OK, but if I shut down the NSClient on a remote Windows server I do not get staleness warnings, even though the last check was more than 10 minutes (600 seconds) ago.
I've set active_checks_enabled to 0 and check_freshness to 1, and also defined a check command (check_dummy 2 "Host is stale") that works when I use active checks.
It seems that either the staleness is not being detected or, if it is, the check_command is not executed.

Is there a way to debug this behaviour?

Re: Staleness not reported

Posted: Wed Feb 16, 2011 11:01 am
by mguthrie
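Check your main nagios.cfg file. You may have freshness checking globally disabled (check_service_freshness for services, check_host_freshness for hosts). You should also set a "freshness_threshold" on the service or host; the value is in seconds.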
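
For reference, a minimal sketch of the nagios.cfg settings involved, assuming a Nagios 3.x install. The values are illustrative rather than recommendations, and the debug file path is an assumption that depends on how Nagios was installed.

```
# nagios.cfg (main configuration file); illustrative values only

# Global switches: freshness checking must be enabled here
# as well as on the individual service or host definition
check_service_freshness=1
check_host_freshness=1

# How often (in seconds) Nagios looks for stale results
service_freshness_check_interval=60
host_freshness_check_interval=60

# Optional debugging: 16 = host/service check information
debug_level=16
debug_verbosity=1
# Assumed path; adjust to your installation
debug_file=/usr/local/nagios/var/nagios.debug
```

When Nagios detects a stale result it normally writes a warning to nagios.log saying that it is forcing an immediate check, so if that message never appears, the staleness is probably not being detected at all rather than the check_command failing.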

Re: Staleness not reported

Posted: Thu Apr 21, 2011 9:20 am
by mitchsmith
Hi,

I have just resolved a similar issue. Check the settings in your service definition/template. When configuring NSCA, the documentation says to set:

Code:

check_period  none
This prevents the central nagios server from checking the freshness state, and executing the check command.

Set the check_period to use the same timeperiod as the distributed Nagios server.

Example Remote Server

Code:

define service{
        name                            distributed-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted

        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           20                      ; Check the service every 20 minutes under normal conditions
        retry_check_interval            4                       ; Re-check the service every 4 minutes until a hard state can be determined
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
......
Example Central Server

Code:

define service{
        name                            distributed-service         ; The 'name' of this service template
        active_checks_enabled           0                       ; Active service checks are disabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted

        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           20                      ; Check the service every 20 minutes under normal conditions
        retry_check_interval            4                       ; Re-check the service every 4 minutes until a hard state can be determined

        check_freshness                 1                       ; Enable service 'freshness' checking
        freshness_threshold             1230                    ; Once the last result is older than 1230 seconds (20 min 30 s), the central Nagios actively checks the service state
......
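
The stale-result handler described in the first post could be wired into the central server roughly as follows. This is a sketch only: the command name service-is-stale, the host name winserver01, and the service description are hypothetical, and it assumes check_dummy is installed in the plugin directory that $USER1$ points to.

```
# Hypothetical command: always returns state 2 (CRITICAL) with a fixed message
define command{
        command_name                    service-is-stale
        command_line                    $USER1$/check_dummy 2 "Service result is stale"
        }

define service{
        use                             distributed-service     ; Central-server template from the example above
        host_name                       winserver01             ; Hypothetical host
        service_description             CPU Load                ; Hypothetical; must match the name submitted via NSCA
        check_command                   service-is-stale        ; Run by Nagios when the passive result goes stale
        }
```

Because active checks are disabled in the template, this command should only run when the freshness threshold is exceeded, which turns a silent remote server into a CRITICAL result on the central server.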

Re: Staleness not reported

Posted: Thu Apr 21, 2011 2:54 pm
by rdedon
Thanks for the contribution mitchsmith, great write up! :-)