NagiosXI Socket Timeout Issue

sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

Code: Select all

define service {
	host_name			L2 - CMDC001SVFS01,L2 - CMDC001SVLIC1,L2 - PXDC001SVFS01,L2 - PXLU001SVSQL1
	service_description		Disk E: 85 95
	use				xiwizard_windowsserver_nsclient_service
	check_command			check_xi_service_nsclient!*********PASSWORD*****!USEDDISKSPACE!-l E -w 85 -c 95!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}	

Code: Select all

define service {
	host_name			L2 - GHDC001SVPYR1,L2 - PXLU001SVFPM1
	service_description		SQL Server SQLEXPRESS
	use				xiwizard_windowsserver_nsclient_service
	display_name			SQL Server (SQLEXPRESS)
	check_command			check_xi_service_nsclient!**Password***!SERVICESTATE!-l MSSQL\\$$SQLEXPRESS -d SHOWALL!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}	

Code: Select all

define service {
	host_name			L2 - 11LU001SVDC01,L2 - BSLU001SVDC01,L2 - CMDC001SVDC01,L2 - CMDC001SVFS01,L2 - CMDC001SVLIC1,L2 - CMDC001SVSQL1,L2 - EDDC001SVEC01,L2 - EDLU001SVDC01,L2 - GHDC001SVPYR1,L2 - GHLU001SVCSE1,L2 - GHLU001SVEC01,L2 - NGLU001SVDC01,L2 - PXDC001DTVLA1,L2 - PXDC001SVCSE1,L2 - PXDC001SVDC01,L2 - PXDC001SVES01,L2 - PXDC001SVFS01,L2 - PXDC001SVPV01,L2 - PXDC001SVPV02,L2 - PXDC001SVRAD1,L2 - PXDC001SVSPA1,L2 - PXDC001SVSPF1,L2 - PXDC001SVSPF2,L2 - PXDC001SVSQL2,L2 - PXDC001SVWUS1,L2 - PXDC001SVXA93,L2 - PXDC001SVXAC1,L2 - PXDC001SVXAC2,L2 - PXLU001SVAC03,L2 - PXLU001SVAD01,L2 - PXLU001SVDC01,L2 - PXLU001SVDMB1,L2 - PXLU001SVEMP2,L2 - PXLU001SVFPM1,L2 - PXLU001SVSEP1,L2 - pxlu001svsim1,L2 - PXLU001SVSQL1,L2 - PXLU001SVSQL2,L2 - PXLU001SVSQL3,L2 - PXLU001SVSRT1,L2 - PXLU001VW301,L4 - PXLU003SVIS01,L4 - PXLU003SVIS03
	service_description		RAM Usage 80 90
	use				xiwizard_windowsserver_nsclient_service
	check_command			check_xi_service_nsclient!**Password**!MEMUSE!-w 80 -c 90!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}
This is just a small sample of random services that go into the socket timeout state.

I also ran a test on the NSClient; here is the output:
nsclint.PNG
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: NagiosXI Socket Timeout Issue

Post by abrist »

Run nmap against the problematic server, specifying port 12489:

Code: Select all

nmap <host> -p 12489
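A single nmap scan only shows the port state at one moment. To catch intermittent stalls, a repeated, timed TCP connect from the Nagios server may help. Below is a minimal Python sketch of that idea; `probe_tcp` is a hypothetical helper name (not part of Nagios or NSClient++), and the 10-second default timeout is an assumption chosen for illustration.

```python
import socket
import time


def probe_tcp(host, port=12489, timeout=10.0):
    """Attempt a full TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection completes the TCP handshake or raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # includes timeouts and refused connections
        return False, time.monotonic() - start, str(exc)


# Example use (the address is a placeholder, substitute the problematic host):
#   for attempt in range(5):
#       print(probe_tcp("192.0.2.10"))
#       time.sleep(1)
```

If the connect time is occasionally close to the timeout while most attempts return in milliseconds, that would point toward the network path or the listener rather than an individual check.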
Former Nagios employee
sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

Code: Select all

Starting Nmap 5.51 ( http://nmap.org ) at 2014-01-13 20:13 CET                  
Nmap scan report for 10.180.2.78                                                
Host is up (0.00066s latency).                                                  
PORT      STATE SERVICE                                                         
12489/tcp open  unknown                                                         
MAC Address: 00:50:56:8C:00:19 (VMware)  
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI Socket Timeout Issue

Post by slansing »

Can you verify whether NSClient is logging? Navigate to its installation directory and check for an nsclient.log file. If it is there, please attach it here, or a snippet of the last 100 lines. If it is not present, please open the nsclient/nsc.ini file, enable logging, save the .ini file, and restart the service. Then wait a few minutes for some of your checks from Nagios to come through and share the log file. Thanks!
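If the log is large, one way to pull out just the last 100 lines is a short script like the sketch below. `tail_lines` is a hypothetical helper name, and the path in the usage comment is an assumption about a default installation, so adjust it to wherever the log actually lives.

```python
from collections import deque


def tail_lines(path, count=100):
    """Return the last `count` lines of a text file as a list of strings."""
    # errors="replace" keeps reading even if the log contains odd bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque drops older lines as newer ones are read.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


# Example use (assumed install path, adjust as needed):
#   for line in tail_lines(r"C:\Program Files\NSClient++\nsclient.log"):
#       print(line)
```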
sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

Code: Select all

2014-01-13 18:54:21: message:D:\source\nscp\master\nscp\include\settings/impl/settings_ini.hpp:299: Configuration file not found: ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:387: NSClient++ 0,4,2,66 2013-12-05 x64 Loading settings and logger...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:146: Boot.ini found in: C:\Program Files\NSClient++\/boot.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:164: Activating: ini://${exe-path}/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:71: Creating instance for: ini://${exe-path}/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\include\settings/impl/settings_ini.hpp:254: Loading: C:\Program Files\NSClient++\/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:396: NSClient++ 0,4,2,66 2013-12-05 x64 booting...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:397: Booted settings subsystem...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:464: On crash: restart: Nsclientpp
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:476: Archiving crash dumps in: /crash-dumps
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:543: booting::loading plugins
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckDisk
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckEventLog
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckExternalScripts
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckHelpers
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckNSCP
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckSystem
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckWMI
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: NRPEServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: NSClientServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckDisk.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckEventLog.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckExternalScripts.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckHelpers.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckNSCP.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckSystem.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckWMI.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\NRPEServer.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\NSClientServer.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckDisk
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckEventLog
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckExternalScripts
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckExternalScripts\CheckExternalScripts.cpp:89: No wrappings found (adding default: vbs, ps1 and bat)
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckExternalScripts\CheckExternalScripts.cpp:99: No aliases found (adding default)
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckHelpers
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckNSCP
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckNSCP\CheckNSCP.cpp:60: Crash folder is: /crash-dumps
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckSystem
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckWMI
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: NRPEServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\NRPEServer\NRPEServer.cpp:92: Allowed hosts definition: 10.180.2.54(255.255.255.255)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:94: Binding to: [::]:5666(ipv6)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :5666
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:90: Binding to: 0.0.0.0:5666(ipv4)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :5666
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: NSClientServer
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\modules\NSClientServer\NSClientServer.cpp:90: Allowed hosts definition: 10.180.2.54(255.255.255.255)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:94: Binding to: [::]:12489(ipv6)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :12489
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:90: Binding to: 0.0.0.0:12489(ipv4)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :12489
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:631: NSClient++ - 0,4,2,66 2013-12-05 Started!
2014-01-13 19:37:21: message:d:\source\nscp\master\nscp\service\simple_client.hpp:37: Enter command to inject or exit to terminate...
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
The last three entries were actual checks.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI Socket Timeout Issue

Post by slansing »

This caught my eye:

Code: Select all

Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
This seems to be quite a rare error; I could only find it mentioned in a couple of places. In the report below, it seems to have resolved itself after service restarts:

http://nsclient.org/nscp/discussion/topic/1130

What version of Windows is this system running? Was the same Windows update applied to these servers as the issue spread?
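To see whether these socket errors line up with the times Nagios runs its checks, the error lines could be tallied per timestamp. The following is a rough Python sketch that assumes the log keeps the `timestamp: level:source:line: message` layout seen in the log posted earlier; `summarize_errors` and the regular expression are illustrative names, not anything shipped with NSClient++.

```python
import re
from collections import Counter

# Assumed layout: "YYYY-MM-DD HH:MM:SS: level:source_file:line: message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"(?P<level>\w+):(?P<src>.*?):(?P<lineno>\d+): (?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count (timestamp, message) pairs for lines logged at level 'error'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "error":
            counts[(match.group("ts"), match.group("msg"))] += 1
    return counts
```

Comparing the resulting timestamps with the check times shown in Nagios XI would indicate whether each timed-out check corresponds to an aborted socket operation on the client.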
sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

The servers are all running Windows Server 2008 R2; however, they are not all at the same patch level. Unfortunately, restarting the service does not help. I also tried installing older versions of the NSClient, with the same problem.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: NagiosXI Socket Timeout Issue

Post by slansing »

Quick question: in your second post, are the drives you showed in a critical state networked storage? It appears as though most of the checks that are timing out are custom plugins on the Windows systems, while the basic NSClient checks are working properly. Is this the case? Sometimes it is a bit hard to tell from service names alone.
sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

Those are all normal local disks. Like I said, it's random: sometimes it's a disk, sometimes it's a random service. I even had the Uptime service go into socket timeout.

Restarting the service does not help either, and neither does reinstalling the NSClient. The socket error seems to be independent of the NSClient version, since I even tried very old NSClient versions.
sievers
Posts: 48
Joined: Tue May 24, 2011 7:34 am

Re: NagiosXI Socket Timeout Issue

Post by sievers »

Due to this problem, our whole installation is no longer usable, since we are getting so many false positives with the socket error.

I also tried installing a 32-bit version on a 64-bit system. The problem, however, persisted.