HTTP - CRITICAL - Socket timeout after 10 seconds

Posted: Tue Mar 22, 2016 12:50 pm
by xxxvii
I have a situation where I'm monitoring both the HTTP and HTTPS services on a host. For some reason which I'm finding hard to explain, the HTTP service is returning the following error: "CRITICAL - Socket timeout after 10 seconds." The HTTPS service, however, is reporting OK.

Based on a few posts in this forum, I tried extending the timeout period to 20, 30, even 60 seconds (e.g., -t 60) in my service definition, but I kept getting the socket timeout message. When I tried setting it to 120 seconds, the error persisted, but with this new message: "Service check timed out after 60.03 seconds."
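
A note on the two messages: they appear to come from two different timeouts. The -t value is check_http's own limit, while Nagios Core also enforces a global service_check_timeout from nagios.cfg (60 seconds in the sample configuration), which would be consistent with the 60.03-second message appearing once -t exceeded it. The sketch below is only a rough illustration, not an official tool: both helper names are hypothetical, and the nagios.cfg path in the usage comment is an assumption based on a default source install.

```shell
# Hypothetical helper: print service_check_timeout from a Nagios main config file.
get_service_check_timeout() {
    # $1: path to nagios.cfg
    awk -F= '$1 ~ /^[ \t]*service_check_timeout[ \t]*$/ {
        gsub(/[ \t]/, "", $2)   # strip whitespace around the value
        print $2
    }' "$1"
}

# Hypothetical helper: which limit is reached first for a given plugin -t value?
# A tie is treated as "plugin", matching what was observed above with -t 60.
first_timeout() {
    # $1: check_http -t value, $2: service_check_timeout (both in seconds)
    if [ "$1" -gt "$2" ]; then
        echo "nagios"   # Nagios kills the check: "Service check timed out after ..."
    else
        echo "plugin"   # check_http gives up: "CRITICAL - Socket timeout after ..."
    fi
}

# Example usage (the path is an assumption; adjust it to your install):
#   limit=$(get_service_check_timeout /usr/local/nagios/etc/nagios.cfg)
#   first_timeout 120 "$limit"
```

Either way, raising -t only changes which timeout is reported; it does not explain why the connection on the HTTP port never completes.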

Curiously, I'm also monitoring a specific web page with the check_http command (over HTTPS, using -S), and that check is reporting OK.

Below you'll find snippets of my original code. If you need anything else, let me know. Thanks!

Code:

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

Code:

define host{
        name                    windows-host-pub  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

Code:

define hostgroup{
        hostgroup_name          web-hosts-pub
        alias                   Web Hosts beyond OZ
}

define service{
        use                     generic-service         ; Inherit default value$
        hostgroups              web-hosts-pub
        servicegroups           webservices
        service_description     HTTP
        check_command           check_http! -H $HOSTADDRESS$ -I $HOSTADDRESS$ -e HTTP/1.
        }

define service{
        use                     generic-service         ; Inherit default value$
        hostgroups              web-hosts-pub
        servicegroups           webservices
        service_description     HTTPS
        check_command           check_http! -H $HOSTADDRESS$ -I $HOSTADDRESS$ -S -e HTTP/1.
        }

Code:

define host{
        host_name               hostx
        alias                   Host X
        address                 111.111.111.11  ; Fictitious IP address for the purpose of this post
        hostgroups              web-hosts-pub
        use                     windows-host-pub
}

define service{
        use                     generic-service         ; Inherit default value$
        host_name               hostx
        service_description     Host X Web Page
        check_command           check_http! -H oururl.com -u "/xxx-yyy/aaa-bbb.page?execution=e1s1" -S
}


Re: HTTP - CRITICAL - Socket timeout after 10 seconds

Posted: Tue Mar 22, 2016 1:31 pm
by bwallace
For the check on the HTTP host/service, try dropping the '-S' from the check command so it looks like this:

Code:

check_http! -H oururl.com -u "/xxx-yyy/aaa-bbb.page?execution=e1s1"

Run this manually from the CLI first, before editing the corresponding file.

Re: HTTP - CRITICAL - Socket timeout after 10 seconds

Posted: Wed Mar 23, 2016 6:46 am
by xxxvii
Thanks, but that isn't the service definition causing the issue. It's this one:

Code:

define service{
        use                     generic-service         ; Inherit default value$
        hostgroups              web-hosts-pub
        servicegroups           webservices
        service_description     HTTP
        check_command           check_http! -H $HOSTADDRESS$ -I $HOSTADDRESS$ -e HTTP/1.
        }

Re: HTTP - CRITICAL - Socket timeout after 10 seconds

Posted: Wed Mar 23, 2016 8:19 am
by xxxvii
bwallace wrote: For the check on the HTTP host/service, try dropping the '-S' from the check command so it looks like this

Code:

check_http! -H oururl.com -u "/xxx-yyy/aaa-bbb.page?execution=e1s1"

As a follow-up to my previous post, I decided to try this just for kicks, and now I am getting the "CRITICAL - Socket timeout after 10 seconds" status for this service as well, just from removing the -S.

So, to reiterate, HTTPS checks work fine, but HTTP checks don't.

Re: HTTP - CRITICAL - Socket timeout after 10 seconds

Posted: Wed Mar 23, 2016 9:35 am
by rkennedy
Can you post the command definition for check_http in your environment? You may be passing -H twice, which could be causing the issue.

From the CLI, can you run /usr/local/nagios/libexec/check_http -H oururl.com -u "/xxx-yyy/aaa-bbb.page?execution=e1s1" and get a proper result?
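
Since HTTPS works and HTTP does not, it may also help to test the two ports directly from the Nagios server before involving the plugin at all. The following is a minimal sketch, assuming bash and the coreutils timeout command are available; probe_port is a hypothetical helper, not part of Nagios, and a timed-out port 80 would only suggest (not prove) that something between the server and the host is dropping plain HTTP traffic.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nagios): attempt a plain TCP connect to
# host:port and give up after a few seconds. Relies on bash's /dev/tcp
# pseudo-device and the coreutils `timeout` command.
probe_port() {
    local host=$1 port=$2 secs=${3:-5}
    if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "closed-or-filtered"   # refused, timed out, or unresolvable
        return 1
    fi
}

# Example usage (hostname and URL are the ones used in this thread):
#   probe_port oururl.com 80
#   probe_port oururl.com 443
#   /usr/local/nagios/libexec/check_http -H oururl.com -u "/xxx-yyy/aaa-bbb.page?execution=e1s1" -v
```

If port 443 reports open while port 80 does not, the problem is more likely network-side than in the service definition; the -v flag on check_http prints the request and response for comparison.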