
Passive check not working giving stale warning

Posted: Thu Jan 04, 2018 4:55 am
by dpa.clt
I have configured a Nagios Core server and installed NRDP on it. On the client side I installed NCPA.
We need to monitor the clients using passive checks.
We were able to configure Nagios Core, NRDP, and NCPA,
but the passive checks are not working as expected and do not report the actual results.
On the Nagios server, all services show as stale:

Warning: The results of service 'Uptime' on host 'client01' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.

Also, even when client01 goes down, all services are still shown in the OK state.

Please see the configuration below:

define host {
use generic-host
name passive_host
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
check_interval 5
retry_interval 1
check_freshness 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options d,u,r
}
define service {
use generic-service
name passive_service
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
check_interval 5
retry_interval 1
check_freshness 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
}

Please suggest how to fix this.

Re: Passive check not working giving stale warning

Posted: Thu Jan 04, 2018 12:58 pm
by npolovenko
@dpa.clt, if the age of the last check result is greater than the freshness threshold, the check result is considered "stale". I think you need to change check_freshness 1 to check_freshness 0 in the service and host definitions. Since the Nagios server has no control over when it receives passive check results, it makes sense to disable this option. If you want to keep the freshness check, you could instead adjust the sleep interval in the ncpa.cfg file on the remote server to something less than 5 minutes, or change check_interval 5 to check_interval 10 in the service definition.
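
To make the staleness rule concrete, here is a minimal Python sketch of the comparison described above. It is a simplified model and not Nagios source code: the function names are hypothetical, and it assumes that when no freshness_threshold is set, the effective threshold is the check interval converted to seconds plus the additional_freshness_latency setting, which I believe defaults to 15 seconds.

```python
def effective_threshold(check_interval_min, freshness_threshold=0,
                        interval_length=60, additional_latency=15):
    """Return the freshness threshold in seconds (simplified model).

    An explicit freshness_threshold wins; otherwise the threshold is
    derived from check_interval (assumption, see the note above).
    """
    if freshness_threshold > 0:
        return freshness_threshold
    return check_interval_min * interval_length + additional_latency


def is_stale(last_check, now, threshold):
    """A result is stale when its age in seconds exceeds the threshold."""
    return (now - last_check) > threshold
```

With freshness_threshold 300 and a passive result that arrives every 300 seconds or slightly later, is_stale() flips to true just before each new result lands, which is why a client interval shorter than the threshold (or a longer threshold) is the usual fix.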

Re: Passive check not working giving stale warning

Posted: Fri Jan 05, 2018 3:06 am
by dpa.clt
@npolovenko, even after making the recommended changes, the passive checks are still not working. If the VM goes down, the host and all of its services are still shown as UP.

Any other suggestions, please?

Re: Passive check not working giving stale warning

Posted: Fri Jan 05, 2018 12:32 pm
by npolovenko
@dpa.clt, if the VM goes down then there's no way for it to send passive checks to the Nagios server. It's all about setting the right intervals, or you could run a ping check and, if that fails, have Nagios turn all the corresponding checks critical. I would like to see whether the passive check results are reaching the Nagios server. Can you take a couple of screenshots of the "Service State" pages for the passive checks? I want to see whether the "last update" time is changing or not.

Re: Passive check not working giving stale warning

Posted: Mon Jan 08, 2018 12:34 am
by dpa.clt
Thanks for the reply.
The last check time is changing, but the last state change time is not, irrespective of the service status. Even if a service or the host goes down, every service is still shown as up. I am attaching screenshots; please check.

Is this because we are explicitly passing check_dummy!0, which means the OK state?

Kindly suggest.

Re: Passive check not working giving stale warning

Posted: Mon Jan 08, 2018 1:55 pm
by npolovenko
@dpa.clt, I'm almost sure that Check Type should say PASSIVE, and it says ACTIVE in your case. Something is not right. Can you show me the config file where you defined the host and services? Not the templates but the actual definitions; they probably look like this:

define service {
    use                    passive_service
    service_description    CPU Usage
    host_name              Client01
}

Re: Passive check not working giving stale warning

Posted: Mon Jan 08, 2018 11:28 pm
by dpa.clt
Please see the configuration

passive_host

define host {
use generic-host
name passive_host
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
check_interval 5
retry_interval 1
check_freshness 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options d,u,r
}

passive_service

define service {
use generic-service
name passive_service
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
check_interval 5
retry_interval 1
check_freshness 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
}

This is a snippet from the command.cfg file:

# Passive Check
define command {
    command_name            check_dummy
    command_line            $USER1$/check_dummy $ARG1$
}


host configuration

define host {
    use            passive_host
    icon_image     ubuntu_logo.png
    host_name      client01
}

host service 


###client01####
#Up Time
define service {
    use                    passive_service
    service_description    Uptime
    freshness_threshold    300
    host_name              client01
}
#CPU Usage
define service {
    use                    passive_service
    service_description    CPU Usage
    freshness_threshold    300
    host_name              client01
}
#Disk Usage
define service {
    use                    passive_service
    service_description    Root Disk Usage
    host_name              client01
}
#Memory Usage
define service {
    use                    passive_service
    service_description    Memory Usage
    host_name              client01
}
#Process Count
define service {
    use                    passive_service
    service_description    Process Count
    host_name              client01
}
#Swap
define service {
    use                    passive_service
    service_description    Swap Usage
    host_name              client01
}
#Services
define service {
    use                    passive_service
    service_description    ncpalistener Service
    host_name              client01
}

define service {
    use                    passive_service
    service_description    ncpapassive Service
    host_name              client01
}

Re: Passive check not working giving stale warning

Posted: Tue Jan 09, 2018 12:08 am
by dpa.clt
I am also seeing the messages below in the nagios.log file:

[1515474415] Warning: The results of host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the host.
[1515474425] Warning: The results of service 'Memory Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Process Count' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Root Disk Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Swap Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpalistener Service' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpapassive Service' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'CPU Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Memory Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Process Count' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Root Disk Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Swap Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Uptime' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpalistener Service' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpapassive Service' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s).  I'm forcing an immediate check of the service.
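
For anyone who wants to summarize warnings like these, here is a rough Python sketch that pulls the host, service, staleness, and threshold out of a single log line. The function names and the regular expression are my own and only cover the message format quoted above; it assumes each warning occupies one line, as in the log file itself.

```python
import re

# Matches both the host form and the "service ... on host ..." form.
STALE_RE = re.compile(
    r"\[(?P<ts>\d+)\] Warning: The results of "
    r"(?:service '(?P<service>[^']+)' on )?host '(?P<host>[^']+)' "
    r"are stale by (?P<age>.+?) \(threshold=(?P<threshold>[^)]+)\)"
)


def parse_duration(text):
    """Convert a string such as '0d 0h 1m 15s' to seconds."""
    days, hours, minutes, seconds = (int(n) for n in re.findall(r"\d+", text))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_stale_warning(line):
    """Return a dict describing one stale warning, or None if no match."""
    match = STALE_RE.search(line)
    if match is None:
        return None
    return {
        "timestamp": int(match.group("ts")),
        "host": match.group("host"),
        "service": match.group("service"),  # None for host warnings
        "stale_by": parse_duration(match.group("age")),
        "threshold": parse_duration(match.group("threshold")),
    }
```

Running parse_stale_warning() over every line of nagios.log and grouping by host makes it easy to see whether all services on a host go stale at the same moment, which would point at the client's sending interval rather than at individual checks.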

Re: Passive check not working giving stale warning

Posted: Tue Jan 09, 2018 12:10 pm
by npolovenko
@dpa.clt, what the freshness check does in your case is wait for its interval (5 minutes in your case). If a passive result reaches Nagios during that period, Nagios uses it; if no passive result is received, it performs an "active" check using the check_dummy plugin. So the first thing I'd do is change the check_dummy command in the host and service definitions to:
check_command check_dummy!2. That way, whenever no passive result reaches Nagios within the freshness interval, the check returns state 2 (Critical) instead of 0 (OK).
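
The effect of that argument can be sketched in a few lines of Python. This is only an illustrative model of the behaviour described here, with a hypothetical function name; it is not how Nagios is implemented internally.

```python
# Standard Nagios plugin return codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def resulting_state(last_passive_state, age_seconds, threshold_seconds,
                    dummy_arg):
    """Return the state shown for a passive service (simplified model).

    While the last passive result is fresh, its state is kept. Once it
    goes stale, the forced check runs check_dummy, which simply returns
    the state passed as its argument.
    """
    if age_seconds <= threshold_seconds:
        return STATES[last_passive_state]
    return STATES[dummy_arg]
```

With check_dummy!0, a client that stopped reporting 400 seconds ago still comes out as OK; with check_dummy!2 the same situation comes out as CRITICAL, which is the behaviour you want when the VM is down.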

Now, the question is why the passive check results don't reach Nagios within the check_interval of 5 minutes. Is it because they arrive a few seconds late? Then I'd increase all check_interval values, maybe to 10. Otherwise, I'd start looking at the NCPA configuration on the remote server. Perhaps ncpa_passive.log could give some more information: C:\Program Files (x86)\Nagios\NCPA\var\log

Re: Passive check not working giving stale warning

Posted: Tue Jan 09, 2018 5:29 pm
by Box293
Can you also provide your NCPA config files? They are:

Windows
C:\Program Files (x86)\Nagios\NCPA\etc\ncpa.cfg
Any file in C:\Program Files (x86)\Nagios\NCPA\etc\ncpa.cfg.d\

All Other
/usr/local/ncpa/etc/ncpa.cfg
Any file in /usr/local/ncpa/etc/ncpa.cfg.d/