Script works and stops working with Nagios + NSClient

Locked
brianp89
Posts: 18
Joined: Thu May 08, 2014 10:53 am

Script works and stops working with Nagios + NSClient

Post by brianp89 »

I tried doing some searches and found one or two others with a similar issue:

http://webcache.googleusercontent.com/s ... clnk&gl=us

Unfortunately I can't get past page 1 to see if there was a solution on page 2...

Anyways, I've got the same issue where my scripts work fine and then something goes wrong after a few runs.

Here are some of the NSClient logs of where the script goes bad:

Code: Select all

2014-09-08 08:41:15: d:..\..\..\..\trunk\modules\CheckExternalScripts\CheckExternalScripts.cpp:249: Command line: cmd /c echo scripts\check_exchange_mailflow.ps1 "exch-mb3" "15" "20"; exit($lastexitcode) | powershell.exe -command -
2014-09-08 08:41:52: d:D:\source\nscp\trunk\include\nrpe/server/protocol.hpp:66: Accepting connection from: 168.141.2.23
2014-09-08 08:42:07: d:..\..\..\trunk\service\NSClient++.cpp:985: Result check_exchange_mailflow: OK
2014-09-08 08:42:07: e:D:\source\nscp\trunk\include\socket/connection.hpp:146: Failed to send data: The file handle supplied is not valid


2014-09-08 08:10:19: d:..\..\..\..\trunk\modules\CheckExternalScripts\CheckExternalScripts.cpp:249: Command line: cmd /c echo scripts\check_exchange_mailflow.ps1 "exch-mb3" "15" "20"; exit($lastexitcode) | powershell.exe -command -
2014-09-08 08:10:52: d:D:\source\nscp\trunk\include\nrpe/server/protocol.hpp:66: Accepting connection from: 168.141.2.23
2014-09-08 08:11:06: d:D:\source\nscp\trunk\include\nrpe/server/protocol.hpp:66: Accepting connection from: 168.141.2.23
2014-09-08 08:11:10: d:..\..\..\trunk\service\NSClient++.cpp:985: Result check_exchange_mailflow: OK
2014-09-08 08:11:10: e:D:\source\nscp\trunk\include\socket/connection.hpp:146: Failed to send data: The file handle supplied is not valid


2014-09-08 07:07:19: d:..\..\..\..\trunk\modules\CheckExternalScripts\CheckExternalScripts.cpp:249: Command line: cmd /c echo scripts\check_exchange_mailflow.ps1 "exch-mb3" "15" "20"; exit($lastexitcode) | powershell.exe -command -
2014-09-08 07:07:52: d:D:\source\nscp\trunk\include\nrpe/server/protocol.hpp:66: Accepting connection from: 168.141.2.23
2014-09-08 07:07:56: d:..\..\..\trunk\service\NSClient++.cpp:985: Result check_exchange_mailflow: OK
2014-09-08 07:07:56: e:D:\source\nscp\trunk\include\socket/connection.hpp:146: Failed to send data: The file handle supplied is not valid

And here's what Nagios reports:

Code: Select all

2014-09-08 08:11:22 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;OK;SOFT;2;exch-mb3 Mailbox OK.
2014-09-08 08:10:52 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
2014-09-08 07:40:32 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;OK;SOFT;2;exch-mb3 Mailbox OK.
2014-09-08 07:38:52 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
2014-09-08 07:08:22 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;OK;SOFT;2;exch-mb3 Mailbox OK.
2014-09-08 07:07:52 SERVICE ALERT: exch-mb3;Exchange Mailbox Test;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
In Nagios I have this as a single service check applied to four servers: exch-mb1, exch-mb2, exch-mb3, and exch-mb4. The script works fine against the other three servers but repeatedly fails against exch-mb3: the check times out, Nagios rechecks within a couple of minutes, and the recheck reports OK. When I run the script manually, it works without issue, and exch-mb3 itself seems to be fine. This leads me to believe there's something wrong with NSClient.
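For reference, the gap between each "Command line" entry and its matching "Result" entry in the NSClient log can be worked out from the timestamps. Here is a rough sketch of that arithmetic. The timestamps are copied from the log excerpts above, the 30-second figure is the check_nrpe timeout that Nagios reports, and the variable names are only for illustration:

```powershell
# Timestamps copied from the NSClient log excerpts above:
# Start = "Command line" entry, End = matching "Result" entry.
$runs = @(
    @{ Start = '2014-09-08 08:41:15'; End = '2014-09-08 08:42:07' },
    @{ Start = '2014-09-08 08:10:19'; End = '2014-09-08 08:11:10' },
    @{ Start = '2014-09-08 07:07:19'; End = '2014-09-08 07:07:56' }
)

$format      = 'yyyy-MM-dd HH:mm:ss'
$culture     = [System.Globalization.CultureInfo]::InvariantCulture
$nrpeTimeout = 30   # seconds, from "CHECK_NRPE: Socket timeout after 30 seconds"

# Elapsed whole seconds for each run
$durations = foreach ($run in $runs) {
    $start = [datetime]::ParseExact($run.Start, $format, $culture)
    $end   = [datetime]::ParseExact($run.End,   $format, $culture)
    [int]($end - $start).TotalSeconds
}

foreach ($seconds in $durations) {
    Write-Host ("Run took {0}s (NRPE timeout: {1}s)" -f $seconds, $nrpeTimeout)
}
```

All three runs come out longer than 30 seconds, so check_nrpe had presumably already given up by the time NSClient tried to send the result. That would fit the "Failed to send data" line that follows each OK result.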

Another similar post:
http://www.nsclient.org/forums/topic/old-1171/

I even added an -ExecutionTimeout option to my script, which tells it to stop after 20 seconds. I've tested it and it works when I run the script manually. Yet the NSClient check takes far longer than that (up to 52 seconds in the logs above), which doesn't make much sense to me because it should be timing out well before then. Any help would be appreciated! TIA
brianp89
Posts: 18
Joined: Thu May 08, 2014 10:53 am

Re: Script works and stops working with Nagios + NSClient

Post by brianp89 »

After adding some debug output to my script to see where it hangs, it appears that under NSClient the script gets stuck importing the PSSession (the output shows the import taking up to 30 seconds in some cases):

Code: Select all

try
{
	#Create session variable for Exchange Powershell
	$Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri http://$exch_mb/PowerShell/ -Authentication Kerberos -ErrorAction Stop

	#Attempt to import the connection (this is the slow step)
	Import-PSSession $Session -AllowClobber >$null
}
catch
{
	#Quote the message so it is passed to Write-Host as a single string
	Write-Host "Connection failed to $exch_mb" -ForegroundColor Yellow
	#Exit code 3 is reported by Nagios as UNKNOWN
	exit 3
}
I'll keep digging and let you know if I find a solution.
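For anyone chasing a similar hang, here is a minimal sketch of the kind of timing output I mean. It is not my exact debug code. Start-Sleep is only a stand-in so the sketch runs anywhere, and the variable names and "DEBUG" labels are made up for illustration. In the real check the two timed steps would be the New-PSSession and Import-PSSession calls:

```powershell
# Time each step with a stopwatch and print the elapsed seconds.
$sw = [System.Diagnostics.Stopwatch]::StartNew()

Start-Sleep -Milliseconds 100     # stand-in for the New-PSSession call
$connectSeconds = $sw.Elapsed.TotalSeconds
Write-Host ("DEBUG connect: {0:N1}s" -f $connectSeconds)

# Reset and restart the stopwatch for the next step
$sw.Reset()
$sw.Start()

Start-Sleep -Milliseconds 300     # stand-in for the Import-PSSession call
$importSeconds = $sw.Elapsed.TotalSeconds
Write-Host ("DEBUG import: {0:N1}s" -f $importSeconds)
```

Since Write-Host output ends up in the plugin output, the extra lines should probably come out again once the slow step has been found.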


EDIT:

Turns out "Exch-mb3" exposes a lot more Exchange commands than the other servers for some reason. Because of this, importing all of the PowerShell commands from the session was taking longer.
I was able to fix this by specifying the one command the script needs to load:

Code: Select all

Import-PSSession $Session -AllowClobber -CommandName Test-MailFlow >$null
Now it's much faster! Sometimes just writing everything out can help me solve the problem :)
Feel free to lock or delete this topic.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Script works and stops working with Nagios + NSClient

Post by slansing »

Ahhh, yeah, sometimes all you need is a sounding board/rubber duck! Glad it's all working ship-shape now.