Host check times out but plugin from command line returns OK

amb_gopai
Posts: 1
Joined: Mon Jun 28, 2021 2:29 pm

Post by amb_gopai »

My issue appears identical to that described here:

https://support.nagios.com/forum/viewto ... =7&t=56089

But that topic is locked and does not appear to have a resolution.

Software version: Nagios Core 4.4.6 (April 28, 2020)
Running on CentOS Linux release 7.9.2009
Nagios plugins EPEL7 v. 2.3.3-2.el7

The relevant entry from the Nagios log file:
[1624909376] wproc: Core Worker 18130: job 51 (pid=27688) timed out. Killing it
[1624909376] wproc: CHECK job 51 from worker Core Worker 18130 timed out after 30.01s
[1624909376] wproc: host=ns2.gopai.com; service=(null);
[1624909376] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1624909376] Warning: Check of host 'ns2.gopai.com' timed out after 30.01 seconds
[1624909376] wproc: Core Worker 18130: job 51 (pid=27688): Dormant child reaped

Relevant command config:
define command {
    command_name check-host-alive-dyn
    command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p $ARG3$
}

The same check run from the command line while the active checks are failing:
[root@nagios log]# for ((i=0; i<10; i++)); do echo "trial $i"; time /usr/lib64/nagios/plugins/check_ping ns2.gopai.com -w 8000.0,80% -c 15000.0,100% -p 10; done
trial 0
PING OK - Packet loss = 0%, RTA = 66.47 ms|rta=66.473999ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.141s
user 0m0.004s
sys 0m0.009s
trial 1
PING OK - Packet loss = 0%, RTA = 66.56 ms|rta=66.559998ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.089s
user 0m0.004s
sys 0m0.008s
trial 2
PING OK - Packet loss = 0%, RTA = 66.44 ms|rta=66.442001ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.092s
user 0m0.001s
sys 0m0.012s
trial 3
PING OK - Packet loss = 0%, RTA = 66.84 ms|rta=66.837997ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.093s
user 0m0.002s
sys 0m0.010s
trial 4
PING OK - Packet loss = 0%, RTA = 66.59 ms|rta=66.592003ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.092s
user 0m0.006s
sys 0m0.007s
trial 5
PING OK - Packet loss = 0%, RTA = 66.74 ms|rta=66.738998ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.092s
user 0m0.005s
sys 0m0.008s
trial 6
PING OK - Packet loss = 0%, RTA = 68.44 ms|rta=68.438004ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.091s
user 0m0.005s
sys 0m0.007s
trial 7
PING OK - Packet loss = 0%, RTA = 66.61 ms|rta=66.605003ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.083s
user 0m0.006s
sys 0m0.007s
trial 8
PING OK - Packet loss = 0%, RTA = 66.43 ms|rta=66.428001ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.094s
user 0m0.003s
sys 0m0.011s
trial 9
PING OK - Packet loss = 0%, RTA = 67.17 ms|rta=67.170998ms;8000.000000;15000.000000;0.000000 pl=0%;80;100;0

real 0m9.093s
user 0m0.006s
sys 0m0.007s
[root@nagios log]#

I'm not sure where the disconnect is, but I have never managed to catch this check taking more than 10 seconds from the command line.

I'm starting a new thread because the relevant one I found is locked. Does anyone have suggestions for troubleshooting this further?
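
One difference between the manual test and the failing check is that the manual runs were done as root with no time limit, while the log shows the Core worker killing the job after 30 seconds. A minimal sketch of a wrapper that runs a command under the same limit and reports elapsed time and exit status is below. The names `timed_run` and `LIMIT` are made up for illustration, and the assumption that checks run as a user called `nagios` may not match every install.

```shell
#!/bin/sh
# Run a command under a hard time limit and report how long it took.
# LIMIT mirrors the 30 s after which the Core worker killed the check
# in the log above (hypothetical variable name).
LIMIT="${LIMIT:-30}"

timed_run() {
    tr_start=$(date +%s)
    tr_status=0
    # coreutils `timeout` exits with status 124 if it had to kill the command
    timeout "$LIMIT" "$@" || tr_status=$?
    tr_end=$(date +%s)
    echo "elapsed=$((tr_end - tr_start))s status=$tr_status"
}

# Example usage (assumption: checks run as the "nagios" user):
# i=0
# while [ "$i" -lt 100 ]; do
#     timed_run sudo -u nagios /usr/lib64/nagios/plugins/check_ping \
#         -H ns2.gopai.com -w 8000.0,80% -c 15000.0,100% -p 10
#     i=$((i + 1))
# done
```

A line ending in `status=124` would mean the plugin really did run past the limit outside of Nagios as well, which would point away from the worker itself.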
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Host check times out but plugin from command line return

Post by pbroste »

Hello,
Thanks for following up and reaching out further on this issue.

It appears that you have done quite a bit of research and troubleshooting up to this point with no clear path to resolution.

I would suggest that you may need to take a step back and look in a different direction. Several suggestions as we advance:
  • Look at network statistics using a "check network plugin" and view the real-time stats on the network adapter to get details.
  • Capture a pcap on the device or server using tcpdump, with filters in place to capture only what is necessary.
  • Capture the plugin results by wrapping the check with the capture_output.pl plugin:
    capture_output.pl /usr/local/nagios/libexec/check_ping [youripaddress] -w 8000.0,80% -c 15000.0,100% -p 10

Example:

define command {
    command_name check-host-alive-dyn
    command_line $USER1$/capture_output.pl /usr/local/nagios/libexec/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p $ARG3$
}

The output of capture_output.pl is located in the /tmp/ directory.
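
For the packet capture suggestion above, a rough sketch of a filtered tcpdump command is below. It only builds and prints the command so it can be reviewed before being run as root. The helper name `capture_cmd`, the interface `any`, and the output path under /tmp are assumptions for illustration; the host name is the one from this thread.

```shell
#!/bin/sh
# Build a tcpdump command that records only ICMP traffic to or from one host.
# All three values below are assumptions; adjust them to your environment.
TARGET="ns2.gopai.com"
IFACE="any"
OUT="/tmp/icmp_${TARGET}.pcap"

capture_cmd() {
    # -n: skip name resolution, -i: interface, -w: write raw packets to a file
    echo "tcpdump -n -i $IFACE -w $OUT icmp and host $TARGET"
}

# Print the command for review. To start the capture, run the printed
# command as root while the host check is timing out.
capture_cmd
```

Comparing the capture timestamps against the timed out entries in the Nagios log should show whether echo replies were missing during the 30 seconds in question.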

Thanks,
Perry