Host check - CRITICAL - Socket timeout
Posted: Wed Dec 12, 2018 3:48 pm
by Sampath.Basireddy
The host check is returning "CRITICAL - Socket timeout" even after I increased host_check_timeout to 45 seconds.
The host responds to ping in about 32 ms, but the check still reports a socket timeout.
This host was created when I set up URL monitoring. Is there another timeout parameter that needs to be updated?
Code: Select all
PING 10.10.40.50 (10.10.40.50) 56(84) bytes of data.
64 bytes from 10.10.40.50: icmp_seq=1 ttl=246 time=32.8 ms
64 bytes from 10.10.40.50: icmp_seq=2 ttl=246 time=32.7 ms
64 bytes from 10.10.40.50: icmp_seq=3 ttl=246 time=33.0 ms
64 bytes from 10.10.40.50: icmp_seq=4 ttl=246 time=32.6 ms
64 bytes from 10.10.40.50: icmp_seq=5 ttl=246 time=32.6 ms
--- 10.40.21.145 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 32.635/32.793/33.067/0.253 ms
Re: Host check - CRITICAL - Socket timeout
Posted: Thu Dec 13, 2018 1:03 pm
by benjaminsmith
Hi @Sampath.Basireddy,
There are global timeout settings for host and service check commands in the main configuration file (nagios.cfg). Try increasing those values and let me know if that resolves the issue. Go to Configure > Core Config Manager > CCM Admin > Core Configs; the Core Configs page opens with the General [nagios.cfg] tab selected by default. Look for:
host_check_timeout=30
service_check_timeout=60
Increase those values, save changes and then apply configuration. If that doesn't work, can you post the check command?
Nagios XI - How To Test Check Commands From The Command-line
https://support.nagios.com/kb/article.php?id=167
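To confirm that the new values were actually written after applying the configuration, the settings can also be read straight from the file. This is a minimal sketch; the helper name show_timeouts and the default path /usr/local/nagios/etc/nagios.cfg are assumptions (the path is the usual Nagios XI location, so adjust it if your install differs).

```shell
# Print the host/service check timeout settings from a Nagios main config file.
# show_timeouts is a hypothetical helper; the default path is an assumption.
show_timeouts() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Match only active (uncommented) settings at the start of a line.
    grep -E '^(host|service)_check_timeout=' "$cfg"
}

# Example:
#   show_timeouts
#   show_timeouts /path/to/other/nagios.cfg
```

If a value printed here does not match what was entered in the Core Configs page, the configuration was probably not applied.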
Re: Host check - CRITICAL - Socket timeout
Posted: Fri Dec 14, 2018 11:31 am
by Sampath.Basireddy
@benjaminsmith, yes, I had already increased host_check_timeout to 45 sec when I noticed the host ping response was taking longer than 30 sec, but no luck.
This is just a ping check. The URL content monitoring on this host works fine; only the host check is timing out.
Any thoughts as to what else we can check here?
Re: Host check - CRITICAL - Socket timeout
Posted: Fri Dec 14, 2018 12:52 pm
by benjaminsmith
Hi @Sampath.Basireddy,
Can you log in to the Nagios server and run the check from the command line? If you're using check_icmp, add the -v option for verbose output.
Code: Select all
cd /usr/local/nagios/libexec
su nagios
./check_icmp -H <ip address> -v
# or
./check_ping -H <ip address> -w 10,2% -c 20,5%
One reason the URL content monitoring could be working while the host check times out is that a firewall on the host server/system is blocking ICMP.
Nagios XI - How To Test Check Commands From The Command-line
https://support.nagios.com/kb/article.php?id=167
Re: Host check - CRITICAL - Socket timeout
Posted: Fri Dec 14, 2018 4:13 pm
by Sampath.Basireddy
Here is the output from both commands:
Code: Select all
nagios@nagiossrv1:[/usr/local/nagios/libexec]: ./check_icmp -H 10.40.21.145 -v
ttl set to 64
Setting alarm timeout to 10 seconds
packets: 5, targets: 1
target_interval: 0.000, pkt_interval 80.000
crit.rta: 500.000
max_completion_time: 3400.000
crit = {500000, 80%}, warn = {200000, 40%}
pkt_interval: 80000 target_interval: 0 retry_interval: 0
icmp_pkt_size: 76 timeout: 10
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
33.226 ms rtt from 10.40.21.145, outgoing ttl: 64, incoming ttl: 246, max: 33.226, min: 33.226
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
32.742 ms rtt from 10.40.21.145, outgoing ttl: 64, incoming ttl: 246, max: 33.226, min: 32.742
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
33.308 ms rtt from 10.40.21.145, outgoing ttl: 64, incoming ttl: 246, max: 33.308, min: 32.742
34.087 ms rtt from 10.40.21.145, outgoing ttl: 64, incoming ttl: 246, max: 34.087, min: 32.742
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
handle_random_icmp(0x7ffd69aab290, 0x7ffd69aab120)
33.123 ms rtt from 10.40.21.145, outgoing ttl: 64, incoming ttl: 246, max: 34.087, min: 32.742
icmp_sent: 5 icmp_recv: 5 icmp_lost: 0
targets: 1 targets_alive: 1
OK - 10.40.21.145: rta 33.297ms, lost 0%|rta=33.297ms;200.000;500.000;0; pl=0%;40;80;; rtmax=34.087ms;;;; rtmin=32.742ms;;;;
targets: 1, targets_alive: 1, hosts_ok: 1, hosts_warn: 0, min_hosts_alive: -1
Code: Select all
nagios@nagiossrv1:[/usr/local/nagios/libexec]: ./check_icmp -H 10.40.21.145 -w 10,2% -c 20,5%
CRITICAL - 10.40.21.145: rta 32.897ms, lost 0%|rta=32.897ms;10.000;20.000;0; pl=0%;2;5;; rtmax=33.179ms;;;; rtmin=32.707ms;;;;
Code: Select all
nagios@nagiossrv1:[/usr/local/nagios/libexec]: ./check_ping -H 10.40.21.145 -w 10,2% -c 20,5%
PING CRITICAL - Packet loss = 0%, RTA = 33.16 ms|rta=33.162998ms;10.000000;20.000000;0.000000 pl=0%;2;5;0
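Note that the two CRITICAL results above come from the thresholds rather than from a timeout: with -w 10,2% -c 20,5% the critical round-trip limit is 20 ms, and the measured RTA is about 33 ms. The sketch below reproduces that comparison under two assumptions: packet loss is ignored (it was 0% in every run), and the comparison is inclusive (>=). classify_rta is a hypothetical helper, not part of the plugins.

```shell
# Classify a round-trip average (ms) against warning/critical RTA thresholds (ms).
# Simplified: packet loss is not considered, and ">=" is an assumption.
classify_rta() {
    awk -v rta="$1" -v warn="$2" -v crit="$3" 'BEGIN {
        if (rta >= crit)      print "CRITICAL"
        else if (rta >= warn) print "WARNING"
        else                  print "OK"
    }'
}

classify_rta 33.16 10 20      # thresholds from -w 10,2% -c 20,5%; prints CRITICAL
classify_rta 33.297 200 500   # check_icmp defaults from the verbose run; prints OK
```

So the same host is OK with the default 200 ms / 500 ms limits and CRITICAL with the 10 ms / 20 ms limits used in the test commands.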
Re: Host check - CRITICAL - Socket timeout
Posted: Fri Dec 14, 2018 4:33 pm
by benjaminsmith
Hi @Sampath.Basireddy,
OK, thanks for running those commands. Since they are returning results, the next step is to take a closer look at the exact check command for the host that is failing.
Can you PM me your system profile along with the name of the host in the image that was uploaded?
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and upload it to a cloud storage of your choice. You can share a link with me in a personal message.
After you upload the profile please post something in this thread to bring it up in the support queue.
Re: Host check - CRITICAL - Socket timeout
Posted: Fri Dec 14, 2018 5:15 pm
by Sampath.Basireddy
@benjaminsmith, I PM'd you the system profile and hostname.
Re: Host check - CRITICAL - Socket timeout
Posted: Mon Dec 17, 2018 12:23 pm
by benjaminsmith
Hi @Sampath.Basireddy,
What's interesting here is that both the host check and the service check use the check_http command, yet one is working and the other is failing. I believe this is because either port 80 is not available or there is a DNS issue with the domain name used for the host check.
Can you run an nmap scan to see if port 80 is open, and post the output?
Also try running the host check from the command line on the Nagios XI server.
Code: Select all
cd /usr/local/nagios/libexec
su nagios
./check_http -H <ip address>
# SSL options
./check_http -H <ip address> -S
./check_http -H <domain name>
The check_http Plugin
https://www.monitoring-plugins.org/doc/ ... _http.html
Re: Host check - CRITICAL - Socket timeout
Posted: Mon Dec 17, 2018 1:00 pm
by Sampath.Basireddy
@benjaminsmith,
Port 80 is open.
Code: Select all
nagios@nagiossrv1:[~]: nmap 10.40.21.145
Starting Nmap 6.47 ( http://nmap.org ) at 2018-12-17 12:26 EST
Nmap scan report for 10.40.21.145
Host is up (0.14s latency).
Not shown: 998 closed ports
PORT STATE SERVICE
80/tcp open http
443/tcp open https
Code: Select all
nagios@nagiossrv1:[/usr/local/nagios/libexec]: ./check_http -H 10.40.21.145
CRITICAL - Socket timeout
nagios@nagiossrv1:[/usr/local/nagios/libexec]: ./check_http -H 10.40.21.145 -S
HTTP OK: HTTP/1.1 200 - 11592 bytes in 0.206 second response time |time=0.205853s;;;0.000000 size=11592B;;;0
I updated the host check to use the check_http command with the "-S" argument, and it is working fine.
When I add a URL Content check, it automatically adds a host check. Isn't that check supposed to be just a ping?
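For anyone comparing the two variants side by side: as far as I can tell, "CRITICAL - Socket timeout" is printed by the plugin itself when its own timeout expires (the -t option, 10 seconds by default, which matches the "Setting alarm timeout to 10 seconds" line in the check_icmp verbose output), so it is separate from host_check_timeout in nagios.cfg. A minimal sketch for running both checks and labelling the result by exit code follows; state_name and run_check are hypothetical helpers, not part of Nagios.

```shell
# Map a plugin exit code to its Nagios state name
# (0 = OK, 1 = WARNING, 2 = CRITICAL, anything else = UNKNOWN).
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run a check command and print "<state> => <first line of output>".
run_check() {
    check_output=$("$@" 2>&1)
    check_rc=$?
    printf '%s => %s\n' "$(state_name "$check_rc")" \
        "$(printf '%s\n' "$check_output" | head -n 1)"
}

# Example, from /usr/local/nagios/libexec as the nagios user:
#   run_check ./check_http -H 10.40.21.145 -t 10
#   run_check ./check_http -H 10.40.21.145 -S -t 10
```

Raising -t on the plain HTTP check would only delay the failure if port 80 accepts the connection but never answers, which is why switching the host check to -S was the effective fix here.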
Re: Host check - CRITICAL - Socket timeout
Posted: Mon Dec 17, 2018 2:00 pm
by benjaminsmith
"When I add a URL Content check, it automatically adds a host check. Isn't that check supposed to be just a ping?"
That particular wizard in Nagios XI uses check_xi_host_http (check_http) to determine if the host is up. You can change this in the Core Configuration Manager if you'd like.
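To see which check command a host is currently using without opening the Core Configuration Manager, the generated object file can be inspected from the shell. This is only a sketch: host_check_commands is a hypothetical helper, and the location of the host files (commonly /usr/local/nagios/etc/hosts/ on Nagios XI) is an assumption. Make the actual change in CCM rather than by editing these files.

```shell
# Print "<host_name>: <check_command>" for each host definition in a
# Nagios object file passed as the first argument.
host_check_commands() {
    awk '
        /^[ \t]*define[ \t]+host[ \t]*[{]/ { in_host = 1; name = ""; cmd = "" }
        in_host && $1 == "host_name"     { name = $2 }
        in_host && $1 == "check_command" {
            # Keep the whole value, including any !arguments.
            sub(/^[ \t]*check_command[ \t]+/, "")
            cmd = $0
        }
        in_host && /[}]/ { print name ": " cmd; in_host = 0 }
    ' "$1"
}

# Example (path and file name are assumptions):
#   host_check_commands /usr/local/nagios/etc/hosts/myhost.cfg
```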