
Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 1:43 pm
by sievers

Code:

define service {
	host_name			L2 - CMDC001SVFS01,L2 - CMDC001SVLIC1,L2 - PXDC001SVFS01,L2 - PXLU001SVSQL1
	service_description		Disk E: 85 95
	use				xiwizard_windowsserver_nsclient_service
	check_command			check_xi_service_nsclient!*********PASSWORD*****!USEDDISKSPACE!-l E -w 85 -c 95!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}	

Code:

define service {
	host_name			L2 - GHDC001SVPYR1,L2 - PXLU001SVFPM1
	service_description		SQL Server SQLEXPRESS
	use				xiwizard_windowsserver_nsclient_service
	display_name			SQL Server (SQLEXPRESS)
	check_command			check_xi_service_nsclient!**Password***!SERVICESTATE!-l MSSQL\\$$SQLEXPRESS -d SHOWALL!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}	

Code:

define service {
	host_name			L2 - 11LU001SVDC01,L2 - BSLU001SVDC01,L2 - CMDC001SVDC01,L2 - CMDC001SVFS01,L2 - CMDC001SVLIC1,L2 - CMDC001SVSQL1,L2 - EDDC001SVEC01,L2 - EDLU001SVDC01,L2 - GHDC001SVPYR1,L2 - GHLU001SVCSE1,L2 - GHLU001SVEC01,L2 - NGLU001SVDC01,L2 - PXDC001DTVLA1,L2 - PXDC001SVCSE1,L2 - PXDC001SVDC01,L2 - PXDC001SVES01,L2 - PXDC001SVFS01,L2 - PXDC001SVPV01,L2 - PXDC001SVPV02,L2 - PXDC001SVRAD1,L2 - PXDC001SVSPA1,L2 - PXDC001SVSPF1,L2 - PXDC001SVSPF2,L2 - PXDC001SVSQL2,L2 - PXDC001SVWUS1,L2 - PXDC001SVXA93,L2 - PXDC001SVXAC1,L2 - PXDC001SVXAC2,L2 - PXLU001SVAC03,L2 - PXLU001SVAD01,L2 - PXLU001SVDC01,L2 - PXLU001SVDMB1,L2 - PXLU001SVEMP2,L2 - PXLU001SVFPM1,L2 - PXLU001SVSEP1,L2 - pxlu001svsim1,L2 - PXLU001SVSQL1,L2 - PXLU001SVSQL2,L2 - PXLU001SVSQL3,L2 - PXLU001SVSRT1,L2 - PXLU001VW301,L4 - PXLU003SVIS01,L4 - PXLU003SVIS03
	service_description		RAM Usage 80 90
	use				xiwizard_windowsserver_nsclient_service
	check_command			check_xi_service_nsclient!**Password**!MEMUSE!-w 80 -c 90!!!!!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			24x7
	notification_interval		60
	notification_period		24x7
	contact_groups			GH_User_Server,PX_User_Server_ALL
	register			1
	}
This is just a small sample of the random services that go into socket timeout.

I also ran a test on the NSClient; here is the output:
nsclint.PNG

Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 2:09 pm
by abrist
Run nmap against the problematic server, specifying port 12489:

Code:

nmap <host> -p 12489
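
A single scan only shows whether the port answers at that moment. Since the timeouts are intermittent, it may also help to probe the port repeatedly and count the failures. Below is a minimal Python sketch, assuming Python 3 is available on the Nagios server. The names `probe_port` and `summarize`, the attempt count, and the timeout values are illustrative choices, not anything that ships with Nagios or NSClient++:

```python
import socket
import time


def probe_port(host, port=12489, attempts=20, timeout=5.0, pause=1.0):
    """Open a TCP connection `attempts` times; return a list of (ok, seconds)."""
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            # create_connection raises OSError (including timeouts) on failure
            with socket.create_connection((host, port), timeout=timeout):
                ok = True
        except OSError:
            ok = False
        results.append((ok, time.monotonic() - start))
        if i < attempts - 1:
            time.sleep(pause)
    return results


def summarize(results):
    """Return (failed_count, slowest_seconds) for a list from probe_port."""
    failed = sum(1 for ok, _ in results if not ok)
    slowest = max((secs for _, secs in results), default=0.0)
    return failed, slowest
```

For example, `failed, slowest = summarize(probe_port("<host>"))` gives the number of failed connects and the duration of the slowest attempt. This only exercises the TCP handshake, so it cannot tell whether NSClient actually answers the check.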

Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 2:21 pm
by sievers

Code:

Starting Nmap 5.51 ( http://nmap.org ) at 2014-01-13 20:13 CET                  
Nmap scan report for 10.180.2.78                                                
Host is up (0.00066s latency).                                                  
PORT      STATE SERVICE                                                         
12489/tcp open  unknown                                                         
MAC Address: 00:50:56:8C:00:19 (VMware)  

Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 2:38 pm
by slansing
Can you verify whether NSClient is logging? Navigate to its installation directory and check for an nsclient.log file. If it is there, please attach it here, or a snippet of the last 100 lines. If it is not present, please open the nsclient/nsc.ini file and enable logging, save the .ini file, and restart the service. Then wait a few minutes for some of your checks from Nagios to come through, and share the log file. Thanks!
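
If the log is large, one way to cut out just the last 100 lines is a small Python helper. This is only a sketch, assuming Python 3 happens to be installed on the Windows host (any text editor works just as well). `last_lines` is a made-up name, and the path in the usage note is a placeholder for wherever the log actually lives:

```python
from collections import deque


def last_lines(path, count=100):
    """Return the last `count` lines of a text file as a list of strings."""
    # errors="replace" keeps going if the log contains bytes that are not valid UTF-8
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent `count` lines in memory
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Usage would be something like `print("\n".join(last_lines(r"<install dir>\nsclient.log")))`, with the placeholder replaced by the real path.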

Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 2:45 pm
by sievers

Code:

2014-01-13 18:54:21: message:D:\source\nscp\master\nscp\include\settings/impl/settings_ini.hpp:299: Configuration file not found: ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:387: NSClient++ 0,4,2,66 2013-12-05 x64 Loading settings and logger...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:146: Boot.ini found in: C:\Program Files\NSClient++\/boot.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:164: Activating: ini://${exe-path}/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\helpers\settings_manager\settings_manager_impl.cpp:71: Creating instance for: ini://${exe-path}/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\include\settings/impl/settings_ini.hpp:254: Loading: C:\Program Files\NSClient++\/nsclient.ini
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:396: NSClient++ 0,4,2,66 2013-12-05 x64 booting...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:397: Booted settings subsystem...
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:464: On crash: restart: Nsclientpp
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:476: Archiving crash dumps in: /crash-dumps
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:543: booting::loading plugins
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckDisk
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckEventLog
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckExternalScripts
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckHelpers
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckNSCP
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckSystem
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: CheckWMI
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: NRPEServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:305: Found: NSClientServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckDisk.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckEventLog.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckExternalScripts.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckHelpers.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckNSCP.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckSystem.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\CheckWMI.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\NRPEServer.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:881: adding C:\Program Files\NSClient++\/modules\NSClientServer.dll
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckDisk
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckEventLog
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckExternalScripts
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckExternalScripts\CheckExternalScripts.cpp:89: No wrappings found (adding default: vbs, ps1 and bat)
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckExternalScripts\CheckExternalScripts.cpp:99: No aliases found (adding default)
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckHelpers
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckNSCP
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\CheckNSCP\CheckNSCP.cpp:60: Crash folder is: /crash-dumps
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckSystem
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: CheckWMI
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: NRPEServer
2014-01-13 19:37:20: debug:D:\source\nscp\master\nscp\modules\NRPEServer\NRPEServer.cpp:92: Allowed hosts definition: 10.180.2.54(255.255.255.255)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:94: Binding to: [::]:5666(ipv6)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :5666
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:90: Binding to: 0.0.0.0:5666(ipv4)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :5666
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:857: Loading plugin: NSClientServer
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\modules\NSClientServer\NSClientServer.cpp:90: Allowed hosts definition: 10.180.2.54(255.255.255.255)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:94: Binding to: [::]:12489(ipv6)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :12489
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:90: Binding to: 0.0.0.0:12489(ipv4)
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\include\socket/server.hpp:197: Attempting to bind to: :12489
2014-01-13 19:37:21: debug:D:\source\nscp\master\nscp\service\NSClient++.cpp:631: NSClient++ - 0,4,2,66 2013-12-05 Started!
2014-01-13 19:37:21: message:d:\source\nscp\master\nscp\service\simple_client.hpp:37: Enter command to inject or exit to terminate...
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
2014-01-13 20:44:22: error:D:\source\nscp\master\nscp\include\socket/server.hpp:253: Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
The last three entries are from actual checks.
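
To see whether these socket errors line up with the timeouts Nagios reports, the log can be tallied by timestamp. Here is a rough Python sketch. The regular expression is inferred only from the shape of the lines in the excerpt above, so it is an assumption and may not match every line NSClient++ writes; `count_errors` is a made-up helper name:

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above:
#   2014-01-13 20:44:22: error:D:\...\server.hpp:253: Socket ERROR: ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"(?P<level>\w+):(?P<src>.*?):(?P<line>\d+): (?P<msg>.*)$"
)


def count_errors(lines):
    """Count error-level messages, keyed by (timestamp, message)."""
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match and match.group("level") == "error":
            counts[(match.group("ts"), match.group("msg"))] += 1
    return counts
```

Feeding it the lines of nsclient.log and printing `count_errors(lines).most_common()` lists the busiest error timestamps first, which can then be compared with the check times shown in Nagios.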

Re: NagiosXI Socket Timeout Issue

Posted: Mon Jan 13, 2014 5:40 pm
by slansing
This caught my eye:

Code:

Socket ERROR: The I/O operation has been aborted because of either a thread exit or an application request
This seems to be quite a rare error; I could only find it mentioned in a couple of places. It seems to have resolved itself after service restarts:

http://nsclient.org/nscp/discussion/topic/1130

What version of Windows is this system running? Was the same Windows update applied to these servers as the issue spread?

Re: NagiosXI Socket Timeout Issue

Posted: Tue Jan 14, 2014 1:44 am
by sievers
The servers are all running Windows Server 2008 R2; however, they are not all at the same patch level. Unfortunately, restarting the service does not help. I also tried installing older versions of NSClient, with the same problem.

Re: NagiosXI Socket Timeout Issue

Posted: Tue Jan 14, 2014 10:52 am
by slansing
Quick question: in your second post, are the drives you showed in a critical state networked storage? It appears as though most of the checks that are timing out are custom plugins on the Windows systems and the basic NSClient checks are working properly. Is this the case? Sometimes it is a bit hard to tell from service names alone.

Re: NagiosXI Socket Timeout Issue

Posted: Wed Jan 15, 2014 1:54 am
by sievers
Those are all normal local disks. Like I said, it's random: sometimes it's a disk, sometimes it's a random service. I even had the Uptime service go into socket timeout.

Also, restarting the service does not help; not even reinstalling NSClient helps. The socket error seems to be independent of the NSClient version, since I even tried very old NSClient versions.

Re: NagiosXI Socket Timeout Issue

Posted: Wed Jan 15, 2014 9:53 am
by sievers
Due to this problem our whole installation is no longer usable, since we are getting so many false positives with socket errors.

I also tried installing a 32-bit version on a 64-bit system. The problem, however, persisted.