Hi!
I searched the forum and Google but did not find an answer. We have Nagios Core 4.4.7 running on Ubuntu 18.04.6 LTS with 2 vCPUs and 4 GB RAM (Standard A2 v2 on Azure). We have ~800 hosts with ~7000 service checks. We do not have performance issues (the server itself is also monitored). We have a new service check that runs for approximately 120 seconds. I cannot change the code; it simply has a long runtime.
The problem is that we are getting "Service check timed out after 60.01 seconds". I know the main config file (nagios.cfg) has a 60-second timeout value. I wrote a custom command, a copy of the check_nrpe_port command with a -t 200 parameter added, and I use it to call the script. The script is a PowerShell file on a Windows server. I also modified the timeout values in nsc.ini. I am still getting the check timeout. Is the global config overriding my command? I think so.
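To illustrate, the custom command is along these lines. This is only a sketch, not copied from my config: the name check_nrpe_port_long and the use of $ARG1$ for the port and $ARG2$ for the remote command are assumptions. The only intended difference from the original command is the added -t 200.

```cfg
# Sketch of the custom command (name and argument layout are assumed).
# $ARG1$ = NRPE port, $ARG2$ = remote command name on the Windows host.
# -t 200 raises check_nrpe's own connection timeout to 200 seconds.
define command {
    command_name    check_nrpe_port_long
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -p $ARG1$ -c $ARG2$ -t 200
}
```

As far as I understand, -t only controls how long check_nrpe itself waits for the remote side. Nagios still stops the plugin once its own service check timeout expires, which would explain the 60.01 seconds in the message.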
I tried raising the global timeout to a higher value such as 120. Almost immediately the usage of both CPUs went to 100% or close to it, and other checks started timing out, so I reverted to the original value.
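For reference, the global setting I mean is, I believe, the service_check_timeout directive in nagios.cfg. The fragment below is a sketch of just that one line, not a copy of my whole file:

```cfg
# nagios.cfg: maximum run time, in seconds, allowed for any service check.
# 60 is the original value; 120 is the value I tried temporarily.
service_check_timeout=60
```

Since this directive applies to every service check, changing it affects all ~7000 checks rather than only the slow one.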
Can't we have an exception for a single service check's timeout, different from what the global config says?
If you need any more information please let me know.
Thanks
Peter