We have an issue where, once a host goes into a soft down state, the retry check runs sooner than the host's retry setting allows. Our retry interval is set to 1 minute on all hosts, but when a host goes into a soft down state the check retries anywhere from 29-39 seconds later, when it should retry after 1 minute.
Here is a sampling of the soft down checks.
2015-11-24 18:53:28 hqs1v-dl3com02 DOWN SOFT 4 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:53:00 hqs1v-dl3com02 DOWN SOFT 3 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:52:15 hqs1v-dl3com02 DOWN SOFT 2 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:51:10 hqs1v-dl3com02 DOWN SOFT 1 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 02:54:37 achdpapp02 DOWN SOFT 5 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:54:09 achdpapp02 DOWN SOFT 4 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:53:50 achdpapp02 DOWN SOFT 3 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:53:11 achdpapp02 DOWN SOFT 2 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:52:50 achdpapp02 DOWN SOFT 1 of 10 CRITICAL - : rta nan, lost 100%
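To put numbers on the gaps between retries, here is a minimal Python sketch that parses the sample lines above and computes the seconds between consecutive soft checks per host. The function name `retry_gaps` and the `LOG` constant are just illustrative, and it assumes every line starts with a date, a time, and the host name, as in the sample.

```python
from datetime import datetime

# Sample SOFT DOWN entries copied from the listing above.
LOG = """\
2015-11-24 18:53:28 hqs1v-dl3com02 DOWN SOFT 4 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:53:00 hqs1v-dl3com02 DOWN SOFT 3 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:52:15 hqs1v-dl3com02 DOWN SOFT 2 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 18:51:10 hqs1v-dl3com02 DOWN SOFT 1 of 5 CRITICAL - : rta nan, lost 100%
2015-11-24 02:54:37 achdpapp02 DOWN SOFT 5 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:54:09 achdpapp02 DOWN SOFT 4 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:53:50 achdpapp02 DOWN SOFT 3 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:53:11 achdpapp02 DOWN SOFT 2 of 10 CRITICAL - : rta nan, lost 100%
2015-11-24 02:52:50 achdpapp02 DOWN SOFT 1 of 10 CRITICAL - : rta nan, lost 100%
"""


def retry_gaps(log_text):
    """Return {host: [seconds between consecutive checks]}, oldest gap first."""
    times = {}
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        times.setdefault(parts[2], []).append(stamp)

    gaps = {}
    for host, stamps in times.items():
        stamps.sort()  # the listing is newest-first, so sort chronologically
        gaps[host] = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return gaps


for host, seconds in retry_gaps(LOG).items():
    print(host, seconds)
```

On the sample lines this gives gaps of 65, 45 and 28 seconds for hqs1v-dl3com02, and 21, 39, 19 and 28 seconds for achdpapp02, so most retries land well under the configured 60 seconds.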
Nagios Environment
Linux Distribution and version - CentOS release 6.6 (Final)
32 or 64bit - 64bit
VMware Image or Manual Install of XI? - VMWare Image
Are there special configurations on your system, i.e. is Gnome installed - Nope
Are you using a proxy - Nope
Are you using SSL - Nope
check_host_freshness=0
host_freshness_check_interval=60
host_inter_check_delay_method=s
max_host_check_spread=30
This is a similar issue to: https://support.nagios.com/forum/viewto ... =7&t=22249 but we never saw a resolution.
Thanks.
Host Rechecks faster than retry setting
Re: Host Rechecks faster than retry setting
This could be a bug, seeing as no solution was found in the earlier thread. I'd like to gather a bit more information about your system first, though.
How many hosts / service checks are running on your machine? How many CPUs are allocated to it? What's the load like on it?
As well, please post the output of the following:
top|head -5
Also, navigate to Admin -> Monitoring Engine Status and Admin -> System Status, and post a screenshot of both pages for us to take a look at.
Former Nagios Employee
Re: Host Rechecks faster than retry setting
Attached are the screenshots for top, Monitoring Engine status and System Status.
Let me know if you need any further information.
Re: Host Rechecks faster than retry setting
Nothing looks too crazy above. Can you post a complete definition for one of the hosts as well?
Former Nagios Employee
Re: Host Rechecks faster than retry setting
Here is the host config for one of the hosts. (Nothing out of the ordinary here.)
- Box293
Re: Host Rechecks faster than retry setting
Can you please post the config for the host template 24x7-linux-server_event, including any other definitions used by this template, such as time periods?
Also, can you find this host object in these files and post the details here please:
/usr/local/nagios/var/objects.cache
/usr/local/nagios/var/retention.dat
If you have a ramdisk implemented, objects.cache may not be in this location and you'll need to consult /usr/local/nagios/etc/nagios.cfg for its location.
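If it helps, here is a rough Python sketch for pulling a single host's block out of those two files. It assumes the usual layout, where objects.cache uses `define host {` blocks with whitespace-separated fields and retention.dat uses `host {` blocks with `key=value` lines; the helper names `find_host_blocks` and `_block_host` are made up for this example, and the host name passed in is just one from the first post.

```python
import os


def _block_host(lines):
    """Return the host_name value found inside a block, or None."""
    for line in lines[1:-1]:
        # Accept both "host_name<TAB>value" and "host_name=value".
        fields = line.strip().replace("=", " ", 1).split(None, 1)
        if len(fields) == 2 and fields[0] == "host_name":
            return fields[1].strip()
    return None


def find_host_blocks(text, host_name):
    """Return host blocks (not service blocks) whose host_name matches."""
    blocks, current = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if current is None:
            if stripped.endswith("{"):
                current = [line]  # start of a new block
        else:
            current.append(line)
            if stripped == "}":  # end of the block
                header = current[0].strip()[:-1].strip()
                if header in ("host", "define host") and _block_host(current) == host_name:
                    blocks.append("\n".join(current))
                current = None
    return blocks


# Default locations from this post; adjust if a ramdisk relocates objects.cache.
for path in ("/usr/local/nagios/var/objects.cache",
             "/usr/local/nagios/var/retention.dat"):
    if os.path.exists(path):
        with open(path, errors="replace") as fh:
            for block in find_host_blocks(fh.read(), "achdpapp02"):
                print(block)
```

The loop at the bottom simply skips any file that is not present, so it is safe to run on a machine without Nagios installed.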
Re: Host Rechecks faster than retry setting
Attached is the server template, retention and cache information.
Re: Host Rechecks faster than retry setting
Retention.cache
Re: Host Rechecks faster than retry setting
Do you have parent/child relationships set up, or predictive checks enabled?
https://assets.nagios.com/downloads/nag ... ncies.html
https://assets.nagios.com/downloads/nag ... ility.html
https://assets.nagios.com/downloads/nag ... hecks.html
Either of those might be causing things to be checked more often than they are configured to.
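A quick way to see whether predictive checks are switched on is to read the relevant lines out of nagios.cfg. Below is a small Python sketch for that; the helper `read_settings` is made up for this example, and the two directive names are the ones I recall from the Nagios Core main configuration documentation, so treat them as an assumption and confirm them against your own nagios.cfg.

```python
import os


def read_settings(cfg_text, keys):
    """Return {key: value} for the requested key=value directives."""
    found = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and non-directive lines
        key, value = line.split("=", 1)
        if key.strip() in keys:
            found[key.strip()] = value.strip()
    return found


# Assumed directive names; verify them against your nagios.cfg.
PREDICTIVE_KEYS = {
    "enable_predictive_host_dependency_checks",
    "enable_predictive_service_dependency_checks",
}

CFG_PATH = "/usr/local/nagios/etc/nagios.cfg"
if os.path.exists(CFG_PATH):
    with open(CFG_PATH, errors="replace") as fh:
        for key, value in sorted(read_settings(fh.read(), PREDICTIVE_KEYS).items()):
            print(key, "=", value)
```

A value of 1 would mean the option is enabled. The same helper works for any other directive, such as the max_host_check_spread setting quoted in the first post.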
Former Nagios employee
Re: Host Rechecks faster than retry setting
We do have Service Dependency checks set up. I'll remove the dependency checks and post the results after the next occurrence of the issue.