[SOLVED] Starting with Nagios, monitored site down

Posted: Thu May 23, 2013 3:32 am
by Toontje
Hi all!

I have just started monitoring my external websites using Nagios3.

I am currently monitoring approximately 15 websites, many hosted on the same hosting platform. I am using host available and check_http to check the availability of the sites.
5 of the 15 sites show:

Code:

Host Status:        DOWN (for 0d 1h 36m 19s)
Status Information: CRITICAL - Network Unreachable

where check_http shows:

Code:

HTTP OK: HTTP/1.1 200 OK - 18682 bytes in 0.252 second response time

When I ping those sites from the Nagios server, I get a correct response. There is no sign of the sites or networks being unreachable.

What is happening here? Again, this happens to 5 of the 15 sites I monitor, hosted on the same external hosting platform.

Ton.

Re: Starting with Nagios, monitored site down when it's not

Posted: Thu May 23, 2013 12:43 pm
by abrist
What are you setting for the hostname/address portion of your configs?
Are you successfully pinging the hosts from the nagios server?

Re: Starting with Nagios, monitored site down when it's not

Posted: Thu May 23, 2013 1:59 pm
by Toontje
Example host definition:

Code:

define host{
        use                     website            ; Name of host template to use
        host_name               www.machielsen.net
        alias                   www.machielsen.net
        address                 www.machielsen.net
        }

Then, when I ping the host:

Code:

/etc/nagios3/objects/hosts/websites$ ping www.machielsen.net
PING www.machielsen.net (217.160.225.132) 56(84) bytes of data.
64 bytes from clienteservidor.es (217.160.225.132): icmp_req=1 ttl=49 time=49.3 ms
64 bytes from clienteservidor.es (217.160.225.132): icmp_req=2 ttl=49 time=54.9 ms
64 bytes from clienteservidor.es (217.160.225.132): icmp_req=3 ttl=48 time=51.0 ms
64 bytes from clienteservidor.es (217.160.225.132): icmp_req=4 ttl=49 time=87.0 ms

Re: Starting with Nagios, monitored site down when it's not

Posted: Thu May 23, 2013 4:52 pm
by abrist
Can we see the service/command definition for host available and the "website" template?

Re: Starting with Nagios, monitored site down when it's not

Posted: Fri May 24, 2013 7:16 am
by Toontje

Code:

# check that web services are running
define service {
        hostgroup_name                  websites
        service_description             Check Web
        check_command                   check_http
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

Re: Starting with Nagios, monitored site down when it's not

Posted: Fri May 24, 2013 10:47 am
by abrist
Looks like that is the service definition. Could you by chance post the "website" template as well?

Re: Starting with Nagios, monitored site down when it's not

Posted: Fri May 24, 2013 11:23 am
by Toontje

Code:

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            website ; The name of this host template
        parents                         Ono_Router
        notifications_enabled           1       ; Host notifications are enabled
        event_handler_enabled           1       ; Host event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        check_command                   check-host-alive
        max_check_attempts              10
        notification_interval           0
        notification_period             24x7
        notification_options            d,u,r
        contact_groups                  admins
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

Re: Starting with Nagios, monitored site down when it's not

Posted: Fri May 24, 2013 12:44 pm
by slansing
Did you follow the outlines for creating host/service definitions here?:

http://nagios.sourceforge.net/docs/3_0/ ... tions.html

I would also recommend you verify the config files, let us know if you see any errors returned:

http://nagios.sourceforge.net/docs/3_0/ ... onfig.html

Re: Starting with Nagios, monitored site down when it's not

Posted: Mon May 27, 2013 2:12 am
by Toontje
slansing wrote:Did you follow the outlines for creating host/service definitions here?:

http://nagios.sourceforge.net/docs/3_0/ ... tions.html

As far as I can see, yes.

slansing wrote:I would also recommend you verify the config files, let us know if you see any errors returned:

http://nagios.sourceforge.net/docs/3_0/ ... onfig.html

The verification returned no errors, just one warning about a service (check_http) that was already defined in a host configuration.
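
For reference, the verification step can be sketched in shell as follows. This is only a sketch under assumptions: the binary name `nagios3` and the main config path `/etc/nagios3/nagios.cfg` are guesses based on the `/etc/nagios3/objects/...` path seen in this thread, and `filter_findings` and `verify_config` are hypothetical helper names. The filter keeps only the lines that begin with `Warning:` or `Error:`.

```shell
#!/bin/sh
# Assumed locations (not confirmed in this thread); override via environment.
NAGIOS_BIN="${NAGIOS_BIN:-nagios3}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"

# Read verification output on stdin and keep only warning/error lines.
# "|| true" keeps the function from failing when nothing matches.
filter_findings() {
    grep -E '^(Warning|Error):' || true
}

# Run the pre-flight check (-v verifies the config without starting the daemon)
# and show only the findings.
verify_config() {
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1 | filter_findings
}
```

Calling `verify_config` prints nothing when the configuration is clean; a duplicate service definition like the one mentioned above would be expected to show up as a single `Warning:` line.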

Re: Starting with Nagios, monitored site down when it's not

Posted: Tue May 28, 2013 12:51 pm
by abrist
check-host-alive is most likely using check_ping. Verify that you can actually use this plugin to check the website:

Code:

cd /usr/local/nagios/libexec
./check_ping -H www.machielsen.net -w 200,50% -c 500,100%
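
The same manual check can be repeated for several sites in one go. A minimal sketch, assuming the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); `PLUGIN_DIR`, `state_name`, and `ping_hosts` are hypothetical names, and the plugin directory is an assumption that differs between installs (distribution packages often use `/usr/lib/nagios/plugins` rather than the source-install path).

```shell
#!/bin/sh
# Assumed plugin location; override via environment if yours differs.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"

# Translate a plugin exit code into its state name.
# Anything other than 0, 1, or 2 is reported as UNKNOWN.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run check_ping against each host given as an argument and print
# the host, the resulting state, and the plugin's own output.
ping_hosts() {
    for host in "$@"; do
        output="$("$PLUGIN_DIR/check_ping" -H "$host" -w 200,50% -c 500,100%)"
        rc=$?
        printf '%s: %s (%s)\n' "$host" "$(state_name "$rc")" "$output"
    done
}
```

For example, `ping_hosts www.machielsen.net` runs the plugin once for that site. Comparing the result with a plain `ping` from the same shell shows whether the plugin and the system `ping` disagree.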