[1391617699] nerd: Fully initialized and ready to rock!
[1391617699] wproc: Successfully registered manager as @wproc with query handler
[1391617700] wproc: Registry request: name=Core Worker 12668;pid=12668
[1391617700] wproc: Registry request: name=Core Worker 12669;pid=12669
[1391617700] wproc: Registry request: name=Core Worker 12670;pid=12670
[1391617700] wproc: Registry request: name=Core Worker 12671;pid=12671
[1391617700] wproc: Registry request: name=Core Worker 12672;pid=12672
[1391617700] wproc: Registry request: name=Core Worker 12673;pid=12673
[1391617706] Successfully launched command file worker with pid 12693
[1391617797] wproc: Core Worker 12668: job 52 (pid=13082) timed out. Killing it
[1391617797] wproc: CHECK job 52 from worker Core Worker 12668 timed out after 60.01s
[1391617797] wproc: command: /usr/bin/webinject.pl -c /usr/local/nagios/libexec/webinject/nagios.xml /usr/local/nagios/libexec/webinject/shdtest.xml -s URL=10.13.25.83 -s CLIENT=train0022 -s LOGIN=TR.monitoring
[1391617797] wproc: host=t003_node01; service=train0022;
[1391617797] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1391617797] Warning: Check of service 'train0022' on host 't003_node01' timed out after 60.006s!
[1391617797] wproc: Core Worker 12668: tv.tv_sec is currently 1391617797
[1391617797] wproc: Core Worker 12668: Failed to reap child with pid 13082. Next attempt @ 1391617802.422666
[1391617800] wproc: Core Worker 12671: job 56 (pid=13115) timed out. Killing it
[1391617800] wproc: CHECK job 56 from worker Core Worker 12671 timed out after 60.01s
[1391617800] wproc: command: /usr/bin/webinject.pl -c /usr/local/nagios/libexec/webinject/nagios.xml /usr/local/nagios/libexec/webinject/shdtest.xml -s URL=10.13.25.83 -s CLIENT=train0021 -s LOGIN=TR.monitoring
[1391617800] wproc: host=t003_node01; service=train0021;
[1391617800] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1391617800] Warning: Check of service 'train0021' on host 't003_node01' timed out after 60.006s!
[1391617800] wproc: Core Worker 12671: tv.tv_sec is currently 1391617800
[1391617800] wproc: Core Worker 12671: Failed to reap child with pid 13115. Next attempt @ 1391617805.415578
[1391617802] wproc: Core Worker 12668: job 52 (pid=13082): Dormant child reaped
[1391617806] wproc: Core Worker 12671: job 56 (pid=13115): Dormant child reaped
[1391617821] wproc: Core Worker 12668: job 97 (pid=13364) timed out. Killing it
[1391617821] wproc: CHECK job 97 from worker Core Worker 12668 timed out after 60.00s
[1391617821] wproc: command: /usr/bin/webinject.pl -c /usr/local/nagios/libexec/webinject/nagios.xml /usr/local/nagios/libexec/webinject/shdtest.xml -s URL=10.13.25.83 -s CLIENT=train0030 -s LOGIN=TR.monitoring
webinject erroring out for service and timing out
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
For some reason this service is timing out; it started some time last night. I tested the service and it is indeed running, and I can also telnet to the server on the configured IP successfully. I am running 4.0.2.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: webinject erroring out for service and timing out
Is it timing out every time it checks? Can you manually run the service's check command from your nagios CLI to see if it displays the same results? It could be a change in configuration, an issue with what it is checking on the remote host, the host not accepting its connection, or even a networking issue. Are your other services working fine? Do you have this same check running against another system?
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
Re: webinject erroring out for service and timing out
slansing wrote: Is it timing out every time it checks? Can you manually run the service's check command from your nagios CLI to see if it displays the same results? It could be a change in configuration, an issue with what it is checking on the remote host, the host not accepting its connection, or even a networking issue. Are your other services working fine? Do you have this same check running against another system?
I tried it manually and it timed out.
No configuration change was made on nagios or on the application. The host is accepting connections, as I can telnet to the port and IP. Other services are working fine. We actually run two of these in tandem; the second one is on a different server, but it uses exactly the same webinject check. I triple-checked the nagios configurations and they are accurate.
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
Re: webinject erroring out for service and timing out
The 'sister' app passes fine:
[root@vmmgmtappnagios var]# /usr/bin/webinject.pl -c /usr/local/nagios/libexec/webinject/nagios.xml /usr/local/nagios/libexec/webinject/shdtest.xml -s URL=10.13.25.93 -s CLIENT=train0021 -s LOGIN=TR.monitoring
WebInject OK - All tests passed successfully in 0.239 seconds|time=0.239s;0;20;0;0 case1=0.137s;0;0;0;0 case2=0.007s;0;0;0;0
[root@vmmgmtappnagios var]#
The other app does not:
[root@vmmgmtappnagios var]# /usr/bin/webinject.pl -c /usr/local/nagios/libexec/webinject/nagios.xml /usr/local/nagios/libexec/webinject/shdtest.xml -s URL=10.13.25.83 -s CLIENT=train0021 -s LOGIN=TR.monitoring
WebInject CRITICAL - case #2: Test case number 1 failed
Test: /usr/local/nagios/libexec/webinject/shdtest.xml - 1
Desc: Check Home
GET Request: http://10.13.25.83:9080/RNJ/action/Home
Failed - No valid HTTP response:
500 read timeout
Content-Type: text/plain
Client-Date: Wed, 05 Feb 2014 16:47:53 GMT
Client-Warning: Redirect loop detected (max_redirect = 0)
read timeout at /usr/lib/perl5/site_perl/5.8.8/Net/HTTP/Methods.pm line 257.
Verify: 'var baseActionUrl = 'https://train0021.xxxxxxxx/RNJ/action';'
Failed Positive Verification
TEST CASE FAILED
Response Time = 180.07 sec
-------------------------------------------------------
Test: /usr/local/nagios/libexec/webinject/shdtest.xml - 2
Desc: Logout
GET Request: http://10.13.25.83:9080/RNJ/action/Logout
Failed - No valid HTTP response:
500 read timeout
Content-Type: text/plain
Client-Date: Wed, 05 Feb 2014 16:50:53 GMT
Client-Warning: Redirect loop detected (max_redirect = 0)
read timeout at /usr/lib/perl5/site_perl/5.8.8/Net/HTTP/Methods.pm line 257.
TEST CASE FAILED : case #2: Test case number 1 failed
Response Time = 180.016 sec
-------------------------------------------------------
Test Cases Run: 2
Test Cases Passed: 0
Test Cases Failed: 2
Verifications Passed: 0
Verifications Failed: 3
|time=360.219s;0;20;0;0 case1=180.07s;0;0;0;0 case2=180.016s;0;0;0;0
[root@vmmgmtappnagios var]#
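For reference, the perfdata after the `|` in the two outputs above can be picked apart with a short script. This is only a rough sketch: `parse_perfdata` is a made-up helper name, not part of WebInject or Nagios, and it assumes every item follows the usual plugin layout `label=value[UOM];warn;crit;min;max` with seconds as the unit. The 60-second figure is taken from the "timed out after 60.01s" lines in the log.

```python
def parse_perfdata(line):
    """Return {label: seconds} for each label=value item after the '|'.

    Hypothetical helper for illustration; assumes values carry an 's' unit.
    """
    _, _, perf = line.partition("|")
    timings = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        if not rest:
            continue  # skip anything that is not label=value
        value = rest.split(";")[0]  # drop warn;crit;min;max
        timings[label] = float(value.rstrip("s"))
    return timings


# Perfdata line from the failing manual run against 10.13.25.83
failing = "|time=360.219s;0;20;0;0 case1=180.07s;0;0;0;0 case2=180.016s;0;0;0;0"

# Timeout applied by the Nagios worker, per the log ("timed out after 60.01s")
service_check_timeout = 60.0

for label, secs in parse_perfdata(failing).items():
    verdict = "exceeds timeout" if secs > service_check_timeout else "ok"
    print(label, secs, verdict)
```

Each of the two requests waited out the full 180-second read timeout, so the manual run took about 360 seconds in total. That is well past the point at which the Nagios worker kills the check, which would explain why the log only ever shows a timeout and never the CRITICAL output.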
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: webinject erroring out for service and timing out
Looks like your answer may be in your plugin output; it might be trying to verify with credentials that do not work:
Failed Positive Verification
TEST CASE FAILED
Response Time = 180.07 sec
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
Re: webinject erroring out for service and timing out
slansing wrote: Looks like your answer may be in your plugin output, it might be trying to verify with credentials that do not work..
Failed Positive Verification
TEST CASE FAILED
Response Time = 180.07 sec
Do you know where the plugin output generally goes?
I tried logging in as myself rather than as our monitoring user and it still timed out. I have also logged into the actual web application on this server as myself without a problem.
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
Re: webinject erroring out for service and timing out
I ran a telnet and a GET // command and it failed, so I ended up restarting the application, and nagios is no longer alerting.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: webinject erroring out for service and timing out
Restarting what application? Did you restart nagios, the server nagios is on, or the remote host? Did the GET and telnet to the remote host fail? That is indicative of a network issue residing outside the point of origin at which the command was run.
-
vivithemage
- Posts: 102
- Joined: Tue May 21, 2013 2:52 pm
Re: webinject erroring out for service and timing out
slansing wrote: Restarting what application? Did you restart nagios, the server nagios is on, or the remote host? The get and telnet failed to the remote host? That is indicative of a network issue residing outside the point of origin at which the command was run.
I restarted nagios to see, but it still did not work.
I then restarted the application, which was working as an application but failing both my nagios test and my manual GET // request via telnet. I did the telnet from nagios to the remote server/app.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: webinject erroring out for service and timing out
Okay, so I think what you are trying to say is that your GET request did not go through either? So it sounds like it is an issue with the remote host.
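The distinction that came up in this thread, a port that accepts connections while the application behind it never answers, can be checked with something like the following. Treat it as a rough sketch: `probe` is a made-up function name, the 5-second default timeout is arbitrary, and the host, port and path in the commented usage line are simply the ones from the WebInject output earlier in the thread.

```python
import socket


def probe(host, port, path="/", timeout=5.0):
    """Tell a refused/unreachable port apart from an app that never replies.

    Hypothetical helper for illustration. Returns 'connect_failed',
    'read_timeout', 'connection_error', 'closed_without_response',
    or the HTTP status line sent by the server.
    """
    try:
        # Equivalent of the telnet test: can we open the TCP connection?
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"
    with sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        try:
            sock.sendall(request.encode("ascii"))
            data = sock.recv(4096)  # wait for the first bytes of a reply
        except socket.timeout:
            return "read_timeout"  # port is open but nothing answered
        except OSError:
            return "connection_error"
    if not data:
        return "closed_without_response"
    return data.split(b"\r\n", 1)[0].decode("latin-1")


# Example call (not run here):
# print(probe("10.13.25.83", 9080, "/RNJ/action/Home"))
```

A `read_timeout` result with a successful connect matches what was seen here: telnet to the port worked, but the GET never got a response until the remote application was restarted.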