how can solve Socket timeout problem

baber
Posts: 308
Joined: Wed Oct 21, 2015 4:39 am

Post by baber »

Dear all,

Hi,
I have a Windows server, but sometimes 1 or 2 sensors show this message on the Nagios server: (CRITICAL - Socket timeout after 10 seconds)
The other services are OK. This message usually appears about every 10 minutes, and after that the check is OK again.

How can I solve this problem?

Please see the attached pictures. First pic1 is shown, then after a few minutes pic2 appears and everything is OK, but a few minutes later pic1 appears again.

This problem appears on only 2 or 3 of my 110 servers.

What is the problem?
Attachments
pic2.jpg
pic1.jpg
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: how can solve Socket timeout problem

Post by mcapra »

If this is using check_nt, you could try altering your Service/Command definitions to have a higher timeout threshold. Something like -t 30 might be sufficient.

If you're not sure how to go about that, can you show us the associated Service and Command definitions that are having problems?
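
For anyone who wants to try a longer timeout by hand before editing the definitions, here is a minimal Python sketch that only builds a check_nt command line with an explicit -t value. The plugin path, the function name build_check_nt_command and its parameters are assumptions for illustration and are not part of any Nagios API; 12489 is the default port check_nt uses.

```python
import shlex

# Assumed plugin location; adjust to wherever check_nt lives on your Nagios server.
CHECK_NT = "/usr/local/nagios/libexec/check_nt"


def build_check_nt_command(host, variable, timeout=30, port=12489, extra_args=""):
    """Return the argv list for a manual check_nt run with an explicit -t timeout."""
    argv = [CHECK_NT, "-H", host, "-p", str(port), "-v", variable, "-t", str(timeout)]
    # extra_args carries per-check options such as "-w 90 -c 95".
    argv += shlex.split(extra_args)
    return argv


if __name__ == "__main__":
    # x.x.x.x is a placeholder; replace it with the Windows host's IP.
    print(" ".join(build_check_nt_command("x.x.x.x", "MEMUSE", extra_args="-w 90 -c 95")))
```

Running the printed command as the nagios user should show whether the agent is just slow to answer or whether the connection is never established; a larger -t can only help in the first case.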
Former Nagios employee
https://www.mcapra.com/
baber
Posts: 308
Joined: Wed Oct 21, 2015 4:39 am

Re: how can solve Socket timeout problem

Post by baber »

mcapra wrote: If this is using check_nt, you could try altering your Service/Command definitions to have a higher timeout threshold. Something like -t 30 might be sufficient.

If you're not sure how to go about that, can you show us the associated Service and Command definitions that are having problems?
Thanks.
I used -t 240, but that sensor now shows this message:

connect to address x.x.x.x and port 12489: Connection timed out
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: how can solve Socket timeout problem

Post by rkennedy »

Can you attach the nsclient.log file from the client machine, and also show us your entire service definition?
Former Nagios Employee
baber
Posts: 308
Joined: Wed Oct 21, 2015 4:39 am

Re: how can solve Socket timeout problem

Post by baber »

rkennedy wrote: Can you attach the nsclient.log file from the client machine, and also show us your entire service definition?
I have attached a screenshot of the critical error as it appears.

I have also attached the nsclient.log and server.cfg files.

define host{
	use		windows-server	; Inherit default values from a template
	host_name	onlinecard_cdc1	; The name we're giving to this host
	alias		server onlinecard_cdc1	; A longer name associated with the host
	address		x.x.x.x		; IP address of the host
	}


define service{
	use			local-service
	host_name		onlinecard_cdc1
	service_description	Memory Usage
	check_command		check_nt!MEMUSE!-w 90 -c 95
	}

define service{
	use			local-service
	host_name		onlinecard_cdc1
	service_description	 Service
	check_command		check_nt!SERVICESTATE!-d SHOWALL -l SNMP -t 240
	}

define service{
	use			local-service
	host_name		onlinecard_cdc1
	service_description	Cpu Usage
	check_command		check_nrpe!alias_cpu
	}

define service{
	use			local-service
	host_name		onlinecard_cdc1
	service_description	Disk Space
	check_command		check_nrpe!alias_volumes
	}

define service{
	use			local-service
	host_name		onlinecard_cdc1
	service_description	Time
	check_command		check_nrpe!check_windows_time
	}
define service{
        use                     generic-service
        host_name               onlinecard_cdc1
        service_description     Uptime
        check_command           check_nt!UPTIME -t 180
        }


define service{
        use                     generic-service
        host_name               onlinecard_cdc1
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90 
	}
-t does not help: I set it for the Service check, but after a few minutes it shows this error:

connect to address x.x.x.x and port 12489: Connection timed out
Attachments
cpu.jpg
nsclient.log
nsclient.ini
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: how can solve Socket timeout problem

Post by rkennedy »

It looks like some of your checks that aren't failing are using NRPE, but the failing ones are using check_nt. Is anything blocking traffic on port 12489 periodically?
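
One rough way to check for periodic blocking is to probe the port on a schedule from the Nagios server and log what each attempt returns. This is only a sketch: the function names probe_port and watch, the 30-second interval and the number of rounds are assumptions, not anything provided by Nagios or NSClient++.

```python
import socket
import time


def probe_port(host, port=12489, timeout=5.0):
    """Try one TCP connect and return 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing is listening on the port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all, which is what a firewall that drops packets looks like.
        return "timeout"
    except OSError:
        # Anything else, e.g. name resolution failure or no route to host.
        return "error"


def watch(host, port=12489, interval=30, rounds=40):
    """Print one timestamped probe result every `interval` seconds."""
    for _ in range(rounds):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, host, port, probe_port(host, port), flush=True)
        time.sleep(interval)


# Example (replace x.x.x.x with the Windows host's IP):
# watch("x.x.x.x")
```

If the log alternates between open and timeout on roughly the same schedule as the CRITICAL alerts, something between the two machines is dropping traffic on that port part of the time.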
Former Nagios Employee
baber
Posts: 308
Joined: Wed Oct 21, 2015 4:39 am

Re: how can solve Socket timeout problem

Post by baber »

rkennedy wrote: It looks like some of your checks that aren't failing are using NRPE, but the failing ones are using check_nt. Is anything blocking traffic on port 12489 periodically?
All ports on my servers are open; my servers have full port access. I am really confused.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: how can solve Socket timeout problem

Post by rkennedy »

From the command line of Nagios, please run the following commands and post back the full output (install nmap if you do not have it already) -

nmap x.x.x.x -p 12489
nmap x.x.x.x
(replace x.x.x.x with the IP of the Windows machine we're troubleshooting here.)
Former Nagios Employee
baber
Posts: 308
Joined: Wed Oct 21, 2015 4:39 am

Re: how can solve Socket timeout problem

Post by baber »

rkennedy wrote: From the command line of Nagios, please run the following commands and post back the full output (install nmap if you do not have it already) -

nmap x.x.x.x -p 12489
nmap x.x.x.x
(replace x.x.x.x with the IP of the Windows machine we're troubleshooting here.)

nmap 10.4.1.144 -p 12489

Starting Nmap 5.21 ( http://nmap.org ) at 2016-08-18 00:08 IRDT
mass_dns: warning: Unable to determine any DNS servers. Reverse DNS is disabled. Try using --system-dns or specify valid servers with --dns-servers
Nmap scan report for 10.4.1.144
Host is up (0.00085s latency).
PORT      STATE    SERVICE
12489/tcp filtered unknown

Nmap done: 1 IP address (1 host up) scanned in 0.35 seconds

nmap 10.0.4.123 -p 12489

Starting Nmap 5.21 ( http://nmap.org ) at 2016-08-18 00:11 IRDT
mass_dns: warning: Unable to determine any DNS servers. Reverse DNS is disabled. Try using --system-dns or specify valid servers with --dns-servers
Nmap scan report for 10.0.4.123
Host is up (0.00082s latency).
PORT      STATE SERVICE
12489/tcp open  unknown
MAC Address: 00:0C:29:4A:E1:B0 (VMware)

Nmap done: 1 IP address (1 host up) scanned in 0.17 seconds

 nmap 10.0.4.119 -p 12489

Starting Nmap 5.21 ( http://nmap.org ) at 2016-08-18 00:13 IRDT
mass_dns: warning: Unable to determine any DNS servers. Reverse DNS is disabled. Try using --system-dns or specify valid servers with --dns-servers
Nmap scan report for 10.0.4.119
Host is up (0.0015s latency).
PORT      STATE SERVICE
12489/tcp open  unknown
MAC Address: 00:50:56:86:28:0D (VMware)

Nmap done: 1 IP address (1 host up) scanned in 0.17 seconds

rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: how can solve Socket timeout problem

Post by rkennedy »

For the record, which machine are we troubleshooting? It looks like you posted three separate nmap scans.
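
In the scans posted above, 10.4.1.144 reports port 12489 as filtered while 10.0.4.123 and 10.0.4.119 report it as open. In nmap, filtered means the probes got no answer, which usually points to a packet filter dropping them. To compare several scans at a glance, a small parser along these lines might help; the function name port_states and the regular expression are assumptions for illustration and only handle the plain-text port table shown in this thread.

```python
import re

# Matches nmap port-table rows such as "12489/tcp filtered unknown".
PORT_ROW = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)")


def port_states(nmap_output):
    """Return {host: {port: state}} parsed from plain-text nmap output."""
    results = {}
    host = None
    for raw in nmap_output.splitlines():
        line = raw.strip()
        if line.startswith("Nmap scan report for "):
            # The last token is the address, possibly wrapped in parentheses.
            host = line.split()[-1].strip("()")
            results[host] = {}
            continue
        match = PORT_ROW.match(line)
        if match and host is not None:
            results[host][int(match.group(1))] = match.group(3)
    return results
```

Feeding it the three outputs from this thread would give one entry per host, so the machine whose port is filtered stands out immediately.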
Former Nagios Employee