Faced issue with check_http
We recently faced an issue where our payment provider's service was returning a 502 error code, but Nagios did not detect or report any errors.
Here are the application logs for that time:
2015-04-12 13:44:02.410 INFO [main] [com.digi.ecommerce.cronjob.fulfillment.logic.finance.CheckPaymentStatus] (CheckPaymentStatus.java:257) - Error Number = 50
Error Category = null
Exception =>
java.io.IOException: Server returned HTTP response code: 502 for URL: https://www.xxxxx.com/epayment/webservi ... ryCardInfo
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)
The service is defined as follows:
define service{
use generic-service
host_name localhost
service_description xxxxx.com
servicegroups connectivity
check_command check_custom_http!-H www.xxxxx.com -u "/epayment/webservice/TxInquiryCardDetails/TxDetailsInquiry.asmx/TxDetailsInquiryCardInfo?MerchantCode=yyy&ReferenceNo=zzzz&Amount=2165.00&Version=2" -S -w3 -c5
}
# 'check_custom_http' command definition
define command{
command_name check_custom_http
command_line $USER1$/check_http $ARG1$
}
Can you please suggest why the alert might not have fired? Are there any debugging steps we can take to identify the problem?
Re: Faced issue with check_http
I assume it's not broken now. Can you recreate the problem somehow, and if so can you curl the URL so we can check out the output?
Re: Faced issue with check_http
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html, a 502 means: "The server, while acting as a gateway or proxy, received an invalid response from the upstream server it accessed in attempting to fulfill the request."
check_http should resolve 502 codes as CRITICAL:
Code: Select all
./check_http -H www.getstatuscode.com -u "/502"
HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 37518 bytes in 1.619 second response time |time=1.619153s;;;0.000000 size=37518B;;;0
Is it possible that the 502 error was resolved by the time Nagios checked it next? If you run the command from the CLI, does it return appropriate results?
Code: Select all
/usr/local/nagios/libexec/check_http -H www.xxxxx.com -u "/epayment/webservice/TxInquiryCardDetails/TxDetailsInquiry.asmx/TxDetailsInquiryCardInfo?MerchantCode=yyy&ReferenceNo=zzzz&Amount=2165.00&Version=2" -S -w3 -c5
Is a proper certificate in place for the '-S' to succeed?
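For anyone following along, the point about a 502 resolving as CRITICAL can be sketched in shell. This is only an illustration of how check_http, as far as I understand its defaults (no -e or -f options given), turns a status code into a state. The function name expected_state is made up for this example and is not part of nagios-plugins, and the handling of codes outside 100-599 is an assumption.

```shell
# Hypothetical helper, not part of nagios-plugins: map an HTTP status code to
# the state check_http is expected to report with default options.
expected_state() {
  code=$1
  if [ "$code" -ge 500 ] && [ "$code" -lt 600 ]; then
    echo CRITICAL    # server errors such as 502 and 503
  elif [ "$code" -ge 400 ] && [ "$code" -lt 500 ]; then
    echo WARNING     # client errors such as 404
  elif [ "$code" -ge 100 ] && [ "$code" -lt 400 ]; then
    echo OK          # includes 3xx: redirects count as OK unless -f changes that
  else
    echo CRITICAL    # assumption: anything else is treated as an invalid status
  fi
}

expected_state 502   # prints CRITICAL
expected_state 301   # prints OK
```

Note that a 3xx answer counts as OK by default, so a check that only ever sees a redirect will stay green regardless of what the redirect target returns.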
Re: Faced issue with check_http
jdalrymple wrote: I assume it's not broken now. Can you recreate the problem somehow, and if so can you curl the URL so we can check out the output?
Was it broken earlier at some point in time? How can I verify whether I am using the latest check_http plugin or an older one?
Re: Faced issue with check_http
jolson wrote: Is it possible that the 502 error was resolved by the time Nagios checked it next? If you run the command from the CLI, does it return appropriate results? Is a proper certificate in place for the '-S' to succeed?
The issue persisted for a long time. Yes, the proper certificates are in place, and I get the response below when I run the command now:
HTTP OK: HTTP/1.1 200 OK - 1150 bytes in 1.382 second response time |time=1.381566s;3.000000;5.000000;0.000000 size=1150B;;;0
Re: Faced issue with check_http
ankukreja wrote: Was it broken earlier at some point in time? How can I verify whether I am using the latest check_http plugin or an older one?
You can verify the version of check_http at the command line as I indicated:
Code: Select all
[jdalrymple@localhost ~]$ /usr/local/nagios/libexec/check_http -V
check_http v2.0.3 (nagios-plugins 2.0.3)
Code: Select all
[jdalrymple@localhost ~]$ /usr/local/nagios/libexec/check_http -H www.example.com -u /
HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 323 bytes in 0.297 second response time |time=0.296953s;;;0.000000 size=323B;;;0
[jdalrymple@localhost ~]$ echo $?
2
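The number printed by echo $? is the plugin's exit status, which is what Nagios acts on. Under the standard plugin convention, 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN, so the 2 above matches the CRITICAL text. A minimal sketch for turning that number into a name when testing by hand (nagios_state is a hypothetical helper name, not something shipped with Nagios):

```shell
# Hypothetical helper: translate a plugin exit status into its Nagios state name.
nagios_state() {
  case "$1" in
    0) echo OK ;;
    1) echo WARNING ;;
    2) echo CRITICAL ;;
    3) echo UNKNOWN ;;
    *) echo "OUT OF RANGE" ;;   # not one of the four standard plugin codes
  esac
}

nagios_state 2   # prints CRITICAL
```

Run it right after the check, for example nagios_state $? on the next line, to confirm that the exit status and the printed text agree.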
Re: Faced issue with check_http
Here is the version we are using:
check_http v1.4.15 (nagios-plugins 1.4.15)
Could this be the issue? How do we update?
Re: Faced issue with check_http
ankukreja,
Could you test your version of check_http against this URL to see if it alerts properly?
Code: Select all
/usr/local/nagios/libexec/check_http -H www.getstatuscode.com -u "/502"
What is the result of the above? If it resolves as OK, an upgrade may fix the problems that you're having.
Re: Faced issue with check_http
That seems to be correct. Thanks for all your help so far.
[oracle@xxxxx ~]$ /usr/local/nagios/libexec/check_http -H www.getstatuscode.com -u "/502"
HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 37371 bytes in 2.580 second response time |time=2.580020s;;;0.000000 size=37371B;;;0
[oracle@ecrpt01 ~]$
[oracle@ecrpt01 ~]$
[oracle@ecrpt01 ~]$
[oracle@xxxxx ~]$ /usr/local/nagios/libexec/check_http -H www.getstatuscode.com -u "/503"
HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 37426 bytes in 2.088 second response time |time=2.088210s;;;0.000000 size=37426B;;;0
[oracle@ecrpt01 ~]$
Re: Faced issue with check_http
I had a hunch that this might have to do with the '-S' flag in your check_http command definition, so jdalrymple and I gave it a test.
He set up an https site and made it return a 502, but check_http did not pick up on this:
Code: Select all
/usr/local/nagios/libexec/check_http -H x.x.x.x -S
HTTP OK: HTTP/1.1 301 Moved Permanently - 391 bytes in 0.121 second response time |time=0.120680s;;;0.000000 size=391B;;;0
We added '-s /' to the end of the check, and it began working properly:
Code: Select all
/usr/local/nagios/libexec/check_http -H x.x.x.x -S -s /
HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 323 bytes in 0.052 second response time |time=0.052211s;;;0.000000 size=323B;;;0
We think it may have something to do with nginx, but of course there's no way to test this out without a reproduction on your end. The next time you see the site hit a 502 state, you might try tacking '-s /' to the end of your check.
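Since a reproduction is what's missing, one option is to record what the plugin actually returned on every run, so the next 502 in the application log can be lined up against the check result. The sketch below is only an illustration: run_and_log and the log path are made-up names, and it assumes it is acceptable to wrap the check in a small shell function or script.

```shell
# Hypothetical wrapper: run a check, append a timestamped record of its exit
# status and output to a log file, then hand the plugin's own output and exit
# status back unchanged so Nagios still sees the real result.
run_and_log() {
  log_file=$1
  shift
  output=$("$@" 2>&1)
  status=$?
  printf '%s status=%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$output" >> "$log_file"
  printf '%s\n' "$output"
  return "$status"
}

# Example call (paths are assumptions, adjust to your install):
# run_and_log /tmp/check_http_debug.log /usr/local/nagios/libexec/check_http -H www.xxxxx.com -S -w3 -c5
```

With timestamps in the log, an entry showing status=0 at the same moment the application logged a 502 would confirm that the plugin itself saw something different from what the application saw.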