Passive check not working giving stale warning
I have configured a Nagios Core server with NRDP installed on it, and installed NCPA on the client side.
We need to monitor the clients using passive checks.
We were able to configure Nagios Core, NRDP, and NCPA.
However, the passive checks do not seem to work as expected and do not report the actual results.
On the Nagios server, all services show up as stale:
Warning: The results of service 'Uptime' on host 'client01' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
Also, even when client01 goes down, all services are still shown in the OK state.
Please see the configuration below:
define host {
use generic-host
name passive_host
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
check_interval 5
retry_interval 1
check_freshness 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options d,u,r
}
define service {
use generic-service
name passive_service
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
check_interval 5
retry_interval 1
check_freshness 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
}
Please suggest how to fix this.
-
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Passive check not working giving stale warning
@dpa.clt, if the age of the last check result is greater than the freshness threshold, the check result is considered "stale". I think you need to change check_freshness 1 to check_freshness 0 in the service and host definitions. Since the Nagios server no longer has control over when it receives the passive check results, it would make sense to disable this option. If you want to keep the freshness check, you could instead set the sleep interval in the ncpa.cfg file on the remote server to something less than 5 minutes, or change check_interval 5 to check_interval 10 in the service definition.
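If the freshness check is kept, the threshold can also be set explicitly per service so that a result arriving slightly late is not flagged as stale. The fragment below is a minimal sketch, assuming NCPA sends results every 300 seconds; the 600-second value is an illustrative assumption and does not come from the ncpa.cfg in this thread.

```cfg
# Sketch only: explicit freshness threshold for one passive service.
define service {
    use                     passive_service
    host_name               client01
    service_description     Uptime
    check_freshness         1
    freshness_threshold     600     ; seconds; assumed to be twice the NCPA send interval
}
```

When freshness_threshold is left at 0, Nagios works out a threshold on its own from the check or retry interval, which is why changing check_interval also moves the threshold.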
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Passive check not working giving stale warning
@npolovenko, even after making the recommended changes, the passive checks are not working. If the VM goes down, all services and the host are still shown as UP.
Any other suggestions, please?
Re: Passive check not working giving stale warning
@dpa.clt, if the VM goes down, then there's no way for it to send passive checks to the Nagios server. It's all about setting the right intervals, or you may perform a ping check and, if that fails, have Nagios turn all the corresponding checks critical. I would like to see whether the passive check results are reaching the Nagios server. Can you take a couple of screenshots of the "Service State" pages for the passive checks? I want to see whether the "last update" time is changing or not.
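The ping check mentioned above could look roughly like the fragment below. It is a sketch under assumptions: the address is a placeholder, and check-host-alive is assumed to be defined as it is in the sample commands.cfg shipped with Nagios Core.

```cfg
# Sketch only: let Nagios ping the host actively while its services stay passive.
define host {
    use                     passive_host
    host_name               client01
    address                 192.0.2.10          ; placeholder address, replace with the client's IP
    active_checks_enabled   1                   ; overrides the template's 0
    check_command           check-host-alive    ; assumed to exist, as in the sample commands.cfg
    check_freshness         0                   ; no freshness fallback needed for an active check
}
```

This only makes the host state follow reachability. The passive services still need their own handling for the case where no results arrive.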
Re: Passive check not working giving stale warning
Thanks for the reply.
The last check time is changing, but the last state change time is not changing regardless of the service status. Even if a service or the host goes down, every service is still shown as up. I am attaching screenshots; please check.
Is this because we are explicitly passing check_dummy!0, which means the OK state?
Kindly suggest.
Re: Passive check not working giving stale warning
@dpa.clt, I'm almost sure that Check Type should say PASSIVE, and it says ACTIVE in your case. Something is not right. Can you show me the config file where you defined the host and services? Not the templates, but the definitions, they probably look like this:
Code: Select all
define service {
use passive_service
service_description CPU Usage
host_name Client01
}
Re: Passive check not working giving stale warning
Please see the configuration
Code: Select all
passive_host
define host {
use generic-host
name passive_host
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
check_interval 5
retry_interval 1
check_freshness 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options d,u,r
}
passive_service
define service {
use generic-service
name passive_service
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 1
register 0
check_period 24x7
max_check_attempts 5
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
check_interval 5
retry_interval 1
check_freshness 1
contact_groups admins
check_command check_dummy!0
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
}
This is snippet from command.cfg file
# Passive Check
define command {
command_name check_dummy
command_line $USER1$/check_dummy $ARG1$
}
host configuration
define host {
use passive_host
icon_image ubuntu_logo.png
host_name client01
}
host service
###client01####
#Up Time
define service {
use passive_service
service_description Uptime
freshness_threshold 300
host_name client01
}
#CPU Usage
define service {
use passive_service
service_description CPU Usage
freshness_threshold 300
host_name client01
}
#Disk Usage
define service {
use passive_service
service_description Root Disk Usage
host_name client01
}
#Memory Usage
define service {
use passive_service
service_description Memory Usage
host_name client01
}
#Process Count
define service {
use passive_service
service_description Process Count
host_name client01
}
#Swap
define service {
use passive_service
service_description Swap Usage
host_name client01
}
#Services
define service {
use passive_service
service_description ncpalistener Service
host_name client01
}
define service {
use passive_service
service_description ncpapassive Service
host_name client01
}
Re: Passive check not working giving stale warning
I am also seeing the messages below in the nagios.log file:
Code: Select all
[1515474415] Warning: The results of host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the host.
[1515474425] Warning: The results of service 'Memory Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Process Count' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Root Disk Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Swap Usage' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpalistener Service' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpapassive Service' on host 'client01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'CPU Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Memory Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Process Count' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Root Disk Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Swap Usage' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'Uptime' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpalistener Service' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
[1515474425] Warning: The results of service 'ncpapassive Service' on host 'fgsmipurepo01' are stale by 0d 0h 0m 5s (threshold=0d 0h 1m 15s). I'm forcing an immediate check of the service.
Re: Passive check not working giving stale warning
@dpa.clt, what the freshness check does in your case is wait for its interval: if a passive result reaches Nagios during that period (5 minutes in your case), Nagios uses it, but if no passive result was received, it performs an "active" check using the check_dummy plugin. So, the first thing I'd do is change the check_dummy command in the host and service definitions to:
check_command check_dummy!2
That means that whenever no passive result reaches Nagios within the freshness interval, the check returns state 2 (CRITICAL) instead of 0 (OK).
Now, the question is why the passive check results don't reach Nagios within check_interval = 5 minutes. Is it because they arrive a few seconds late? Then I'd increase all check_interval values, maybe to 10. Otherwise, I'd start looking at the NCPA configuration on the remote server. Perhaps ncpa_passive.log could give some more information: C:\Program Files (x86)\Nagios\NCPA\var\log
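As a variation on the check_dummy change, the stale-result fallback can be given its own command with a descriptive message, so the status page says why the service turned CRITICAL. This is a minimal sketch: the command name no_passive_result and the message text are hypothetical, and the remaining directives of the original passive_service template are left out for brevity.

```cfg
# Sketch only: dedicated fallback command run when a passive result goes stale.
define command {
    command_name    no_passive_result       ; hypothetical name
    command_line    $USER1$/check_dummy 2 "CRITICAL: no passive result received within the freshness threshold"
}

# Other directives from the original passive_service template are omitted here.
define service {
    use                     generic-service
    name                    passive_service
    register                0
    active_checks_enabled   0
    passive_checks_enabled  1
    check_freshness         1
    check_command           no_passive_result
}
```

The freshness check runs this command even though active_checks_enabled is 0, which is what makes it usable as a fallback for passive services.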
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Passive check not working giving stale warning
Can you also provide your NCPA config files? They are:
Windows
C:\Program Files (x86)\Nagios\NCPA\etc\ncpa.cfg
Any file in C:\Program Files (x86)\Nagios\NCPA\etc\ncpa.cfg.d\
All Other
/usr/local/ncpa/etc/ncpa.cfg
Any file in /usr/local/ncpa/etc/ncpa.cfg.d/