(Service Check Timed Out) with a custom perl script

ipcthorn
Posts: 4
Joined: Tue Jul 24, 2012 11:40 am

(Service Check Timed Out) with a custom perl script

Post by ipcthorn »

I have been fighting with a custom Perl script that sends a test transaction through a credit card processing API. We run it every minute and have the retry interval set to 30 seconds. I know this is pretty frequent, but our business needs to know about a failure ASAP. About once or twice an evening we get a false positive with the result "(Service Check Timed Out)", but based on the logging in our script, the script is never even invoked. You can see in the logs below that the script runs at 2:40 and 2:42 but not at 2:41.

[Tue Jul 24 02:40:37 2012] Script started
[Tue Jul 24 02:40:37 2012] Using config file /usr/local/nagios/libexec/config-autoresp-10.xml
[Tue Jul 24 02:40:37 2012] ServiceKey: C37875FC7F91300C
[Tue Jul 24 02:40:37 2012] IdentToken found
[Tue Jul 24 02:40:37 2012] BaseURL: https://cws-01.ipcommerce.com
[Tue Jul 24 02:40:37 2012] SignOn SOAP message found
[Tue Jul 24 02:40:37 2012] Authorize SOAP message found
[Tue Jul 24 02:40:37 2012] URL Postfix: /2.0
[Tue Jul 24 02:40:37 2012] using BaseURL: https://cws-01.ipcommerce.com/2.0
[Tue Jul 24 02:40:37 2012] SAK found in ident token: C37875FC7F91300C
[Tue Jul 24 02:40:37 2012] signon duration 0.422683
[Tue Jul 24 02:40:37 2012] signon response code:200
[Tue Jul 24 02:40:37 2012] Session token found in SignOn response:C37875FC7F91300C
[Tue Jul 24 02:40:38 2012] txn duration: 0.803918
[Tue Jul 24 02:40:38 2012] txn response code 200 response msg: <REMOVED XML>
[Tue Jul 24 02:40:38 2012] Script ended
[Tue Jul 24 02:42:37 2012] Script started
[Tue Jul 24 02:42:37 2012] Using config file /usr/local/nagios/libexec/config-autoresp-10.xml
[Tue Jul 24 02:42:37 2012] ServiceKey: C37875FC7F91300C
[Tue Jul 24 02:42:37 2012] IdentToken found
[Tue Jul 24 02:42:37 2012] BaseURL: https://cws-01.ipcommerce.com
[Tue Jul 24 02:42:37 2012] SignOn SOAP message found
[Tue Jul 24 02:42:37 2012] Authorize SOAP message found
[Tue Jul 24 02:42:37 2012] URL Postfix: /2.0
[Tue Jul 24 02:42:37 2012] using BaseURL: https://cws-01.ipcommerce.com/2.0
[Tue Jul 24 02:42:37 2012] SAK found in ident token: C37875FC7F91300C
[Tue Jul 24 02:42:37 2012] signon duration 0.372244
[Tue Jul 24 02:42:37 2012] signon response code:200
[Tue Jul 24 02:42:37 2012] Session token found in SignOn response:C37875FC7F91300C
[Tue Jul 24 02:42:38 2012] txn duration: 0.756385
[Tue Jul 24 02:42:38 2012] txn response code 200 response msg: <REMOVED XML>
[Tue Jul 24 02:42:38 2012] Script ended

This is the output from nagios:

[07-24-2012 02:42:47] SERVICE ALERT: cws-01.ipcommerce.com;Check CWS PSP v10+;OK;HARD;1;OK: At https://cws-01.ipcommerce.com/2.0/Txn/C37875FC7F91300C
[07-24-2012 02:42:37] SERVICE ALERT: cws-01.ipcommerce.com;Check CWS PSP v10+;CRITICAL;HARD;1;(Service Check Timed Out)

I'm just looking for any advice on what to check or try; waking a bunch of people up at random times in the night isn't ideal. It doesn't happen at a consistent time, but the weird thing is that it never happens during the day when we are in the office. We even moved the check to a fresh Nagios install and the issue happens there as well. Thanks in advance for any help.
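
One way to line up the missed runs with the Nagios alerts is to scan the script's own log for gaps between consecutive "Script started" lines. This is a minimal sketch, assuming the log format shown above. The function name find_gaps, the 90-second default threshold, and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# find_gaps LOGFILE [MAX_GAP_SECONDS]
# Prints one line for every pair of consecutive "Script started" entries
# whose start times are more than MAX_GAP_SECONDS apart (default: 90).
# Only the time of day is compared, so a gap longer than 24 hours is not
# detected; a run that crosses midnight is handled by adding one day.
find_gaps() {
    awk -v max="${2:-90}" '
        /Script started/ {
            # In "[Tue Jul 24 02:40:37 2012] Script started", field 4 is HH:MM:SS.
            split($4, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) {
                delta = now - prev
                if (delta < 0) delta += 86400   # crossed midnight
                if (delta > max)
                    printf "gap of %d s between %s and %s\n", delta, prev_ts, $4
            }
            prev = now; prev_ts = $4; seen = 1
        }
    ' "$1"
}

# Example call (hypothetical log path):
# find_gaps /var/log/check_cws.log 90
```

Each reported gap gives a time window to compare against the SERVICE ALERT lines in nagios.log.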
nscott
Posts: 1040
Joined: Wed May 11, 2011 8:54 am

Re: (Service Check Timed Out) with a custom perl script

Post by nscott »

Do you have multiple instances of Nagios running? I think that would be more apparent than a check timing out every once in a while, but it's possible, and it doesn't really make a lot of sense that it wouldn't be logging the event. Is there some passive check being sent that's causing that error? That might not show up in the logs.
Nicholas Scott
Former Nagios employee
ipcthorn
Posts: 4
Joined: Tue Jul 24, 2012 11:40 am

Re: (Service Check Timed Out) with a custom perl script

Post by ipcthorn »

Only one instance of Nagios per server. We are running this check once a minute on two separate servers in the same environment.

No passive checks.
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: (Service Check Timed Out) with a custom perl script

Post by agriffin »

Did you check for multiple instances? If that were the case, it would be inadvertent, not intentional (sorry if you actually did check, but the way you worded your post made it unclear). Try running 'pgrep -l nagios' and see if there are multiple nagios processes. Also, are you using the embedded Perl interpreter?
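
Since Nagios forks child processes to run checks, a raw process list can show more than one nagios entry even with a single daemon. A rough sketch of a check that accounts for this is below: it counts only the nagios processes whose parent is not itself a nagios process. The function name count_nagios_roots is hypothetical, and the sketch assumes the daemon's command name is exactly "nagios".

```shell
#!/bin/sh
# count_nagios_roots
# Reads "PID PPID COMMAND" lines on stdin and prints how many nagios
# processes have a parent that is NOT another nagios process.
# A result of 1 suggests a single daemon (plus forked children);
# a result above 1 suggests separately started instances.
count_nagios_roots() {
    awk '
        $3 == "nagios" { is_nagios[$1] = 1; parent[$1] = $2 }
        END {
            roots = 0
            for (p in parent)
                if (!(parent[p] in is_nagios)) roots++
            print roots
        }
    '
}

# Example call; the header line printed by ps is ignored because its
# third column is not "nagios":
# ps -e -o pid,ppid,comm | count_nagios_roots
```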
ipcthorn
Posts: 4
Joined: Tue Jul 24, 2012 11:40 am

Re: (Service Check Timed Out) with a custom perl script

Post by ipcthorn »

There were two results from that command, but one was a child process of the other (based on the analysis of a more Linux-savvy co-worker), so I don't think that's it.

How could I tell if I'm using the embedded interpreter or not?
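
For what it's worth, in Nagios Core 3.x the embedded Perl interpreter is controlled by the enable_embedded_perl and use_embedded_perl_implicitly directives in nagios.cfg, and an individual plugin can opt in or out with a "# nagios: +epn" or "# nagios: -epn" comment in its first 10 lines. It only applies if Nagios was built with embedded Perl support. A small sketch for inspecting both places follows. The function names are hypothetical, and the paths in the usage comments are assumptions based on the /usr/local/nagios layout in the log above.

```shell
#!/bin/sh
# epn_config CFGFILE
# Prints the embedded-Perl directives that are set in a Nagios main config file.
epn_config() {
    grep -E '^[[:space:]]*(enable_embedded_perl|use_embedded_perl_implicitly)[[:space:]]*=' "$1"
}

# epn_hint PLUGIN
# Prints a per-plugin "# nagios: +epn" or "# nagios: -epn" override if one
# appears in the first 10 lines of the plugin; prints nothing otherwise.
epn_hint() {
    head -n 10 "$1" | grep -E '^#[[:space:]]*nagios:[[:space:]]*[+-]epn'
}

# Example calls (assumed config path, hypothetical plugin name):
# epn_config /usr/local/nagios/etc/nagios.cfg
# epn_hint /usr/local/nagios/libexec/check_cws.pl
```

If enable_embedded_perl is 0 or neither directive is present, the script is most likely being run by the regular perl binary.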