XI 5.8.1
RHEL 7.7
Hi all,
The check scheduler on my prod XI server regularly clumps large numbers of checks at the end of the range while leaving some portions without any checks at all (see the attached screenshot). This happens on its own and sometimes smooths out, but it often ends up like this again. Is this normal, or an indication of a potential issue somewhere?
Check Scheduler oddness
--
Griffin Wakem
Re: Check Scheduler oddness
Hello Griffin,
Thanks for reaching out. There are a number of reasons this could be happening. We would like to take a look at your System Profile so we can see what is going on.
To send us your System Profile:
- Log in to the Nagios XI GUI using a web browser.
- Click the "Admin" > "System Profile" Menu
- Click the "Download Profile" button
- Save the profile.zip file and send via Private Message
Perry
Re: Check Scheduler oddness
Hello @gwakem
Thanks for sending over the System Profile. Reviewing it, we see that your 'check-host-alive' checks are timing out.
You will want to increase the check timeouts. The log shows:
wproc: Core Worker 9844: job 129303 (pid=32402) timed out. Killing it
CHECK job 129303 from worker Core Worker 9844 timed out after 30.00s
Warning: Check of host 'mxxxxxxx-pxxxxx' timed out after 30.00 seconds
wproc: host=monitoring-pi00494; service=(null);
early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Warning: Check of host 'mxxxxxxxx-pxxxxxxx' timed out after 30.01 seconds
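To see which hosts are affected most often, the "timed out" warnings above can be tallied per host. The following is only a rough sketch: the function names are made up for illustration, and the log path in the comment is the usual default on an XI server, which is an assumption about your install.

```python
import re
from collections import Counter

# Matches lines such as:
#   Warning: Check of host 'some-host' timed out after 30.00 seconds
TIMEOUT_RE = re.compile(
    r"Warning: Check of host '(?P<host>[^']+)' timed out after (?P<secs>[\d.]+) seconds"
)

def count_host_timeouts(lines):
    """Return a Counter mapping host name -> number of timed-out host checks."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts

def count_host_timeouts_in_file(path):
    """Same as count_host_timeouts, but reads the log file at `path`.

    Assumed typical location: /usr/local/nagios/var/nagios.log
    """
    with open(path, errors="replace") as log:
        return count_host_timeouts(log)
```

Calling `count_host_timeouts_in_file(...)` on the log and printing `.most_common(10)` should show whether the timeouts come from a handful of hosts or are spread across many.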
check_icmp [options] [-H] host1 host2 hostN
Options:
-t
timeout value (seconds, currently 10)
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
[-p packets] [-t timeout] [-4|-6]
Options:
-t, --timeout=INTEGER:<timeout state>
Seconds before connection times out (default: 10)
Optional ":<timeout state>" can be a state integer (0,1,2,3) or a state STRING
vi /usr/local/nagios/etc/nagios.cfg
Service Check Timeout
Format: service_check_timeout=<seconds>
Example: service_check_timeout=60
Host Check Timeout
Format: host_check_timeout=<seconds>
Example: host_check_timeout=60
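Before editing, it can help to confirm what the two directives are currently set to. This is a minimal sketch that assumes the file uses plain `name=value` lines with `#` comment lines; `read_timeouts` is a hypothetical helper name, not part of Nagios.

```python
def read_timeouts(cfg_text):
    """Return {directive: seconds} for the two check timeout directives found."""
    wanted = {"host_check_timeout", "service_check_timeout"}
    found = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() in wanted:
            found[key.strip()] = int(value.strip())
    return found
```

For example, `read_timeouts(open("/usr/local/nagios/etc/nagios.cfg").read())` would return a small dict with whichever of the two directives are present in the file.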
vi /usr/local/ncpa/etc/ncpa.cfg
Bump the timeouts up by 60 seconds and then check to see how things look. Restart the ncpa_listener.service and nagios.service.
host_check_timeout=30
service_check_timeout=60
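If you would rather script the change than edit by hand, something along these lines could work. It is only a sketch: it reads "bump by 60 seconds" as adding 60 to each current value, which is an assumption, and `bump_timeouts` is a hypothetical helper. It works on the config text only; writing the file back and restarting the services is left to you.

```python
import re

# Matches "host_check_timeout=<n>" or "service_check_timeout=<n>" on its own line.
TIMEOUT_LINE = re.compile(
    r"^([ \t]*(?:host|service)_check_timeout[ \t]*=[ \t]*)(\d+)[ \t]*$",
    re.MULTILINE,
)

def bump_timeouts(cfg_text, extra_seconds=60):
    """Return cfg_text with both check timeouts increased by extra_seconds."""
    def _bump(match):
        new_value = int(match.group(2)) + extra_seconds
        return f"{match.group(1)}{new_value}"
    return TIMEOUT_LINE.sub(_bump, cfg_text)
```

With the values shown above, this would turn `host_check_timeout=30` into `host_check_timeout=90` and `service_check_timeout=60` into `service_check_timeout=120`, leaving every other line untouched. Keep a backup copy of the file before writing the result back.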
Let us know how things are looking,
Perry