Incorrect report
Posted: Fri Nov 01, 2013 1:24 am
by chitrangada
Hi
I have installed Nagios Core 3 and added services for my hosts. It reports different results for different services on the same domain. I have set up check_http and check_ping services on a domain, but they show different statuses: according to Nagios, the site is DOWN for the check_http service and UP for the check_ping service. Can you please tell me why this is happening?
There is another case where a host is UP but a service is DOWN, and vice versa. Can you please explain this behavior?
Thanks
Re: Incorrect report
Posted: Fri Nov 01, 2013 12:25 pm
by tmcdonald
Just to make sure, you do in fact have a web server configured correctly on the server that check_http is checking? You can view it in a web browser?
As for a host being up and a service being down, this is normal. The confusing part is when a host is DOWN but a service on that host is UP. This is usually caused by a misconfigured check, or by a host that is checked with check_ping while its OS blocks ping requests. In that case a service on that host reports as working and the host appears down, even though it is really up.
Re: Incorrect report
Posted: Wed Nov 13, 2013 3:55 am
by chitrangada
Hi,
Could you please explain why I am getting a DOWN or flapping status for a domain which is UP? For me, if a domain is accessible in a browser then it is UP.
Please also mention whether any special care is needed when creating checks on hosts, because I followed your documentation to define the objects.
Thanks in advance
Re: Incorrect report
Posted: Wed Nov 13, 2013 12:32 pm
by sreinhardt
Can you run the same command for the service that is down from the command line and see what the result is? It would also help if we could see the command and service definitions you are having issues with.
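For reference, Nagios decides the state from the plugin's exit code, not from the text it prints: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. Here is a small sketch for running a check by hand and seeing which state it maps to. The helper names state_name and run_check are made up for illustration and are not part of Nagios:

```shell
# Map a plugin exit code to the state name Nagios uses for it.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT OF RANGE" ;;   # not a valid plugin return code
    esac
}

# Run any check command, let its own output through, then report
# the exit code and the state it corresponds to.
run_check() {
    rc=0
    "$@" || rc=$?
    echo "exit code $rc -> $(state_name "$rc")"
}

# Example (hypothetical invocation, uses the plugin path from this thread):
#   run_check /usr/lib/nagios/plugins/check_http -H 188.95.227.20
```

Running the check as the same user Nagios runs as tends to give the most representative result.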
Re: Incorrect report
Posted: Sat Nov 16, 2013 1:15 am
by chitrangada
Hi,
I get the message 'CRITICAL - Socket timeout after 10 seconds' when I run the command '/usr/lib/nagios/plugins/check_http -I 188.95.227.20'. Below are the host and service definitions.
From host.cfg:
define host {
    host_name fredriknas.se
    address 188.95.227.20
    max_check_attempts 5
    check_interval 5
    retry_interval 1
    notifications_enabled 0
    icon_image_alt Linux
    icon_image base/linux40.gif
    statusmap_image base/linux40.gd2
    check_command check-host-alive
    check_period 24x7
    notification_period 24x7
    contact_groups +admins
    use dt_host
}
From host_templates.cfg:
define host {
    name dt_host
    register 0
    max_check_attempts 5
    check_interval 5
    retry_interval 1
    notification_interval 30
    notification_options d,u,r
    notifications_enabled 0
    check_period 24x7
    notification_period 24x7
    check_command check-host-alive
    use generic-host
}
From service.cfg:
define service {
    service_description check_http
    check_command check_http!
    host_name fredriknas.se
    check_period 24x7
    notification_period 24x7
    contact_groups +admins
    max_check_attempts 5
    check_interval 5
    retry_interval 1
    notifications_enabled 1
    use dt_service
}
From service_templates.cfg:
define service {
    name dt_service
    register 0
    max_check_attempts 5
    check_interval 5
    retry_interval 1
    notification_interval 0
    notification_options w,u,c,r,s
    active_checks_enabled 1
    passive_checks_enabled 1
    notifications_enabled 1
    check_freshness 0
    check_period 24x7
    notification_period 24x7
    use generic-service
}
Re: Incorrect report
Posted: Mon Nov 18, 2013 11:04 am
by slansing
Do you not have any warning or critical thresholds defined for that service check? A flapping state will only occur when you have flap detection enabled and the host/service changes states rapidly, which pushes its status to flapping. You are getting a timeout when running check_http against that host, which means either you can't contact it or you have the check incorrectly defined, and in turn you are getting a critical response. Try just running the plugin script itself, it should output usage text and tell you what is required to make it functional.
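To make the flapping part concrete: Nagios looks at the recent check history of a host or service and works out how often the state changed. When that percentage rises above the high flap threshold the object is marked as flapping, and the flag clears once it falls below the low flap threshold. Below is a rough sketch of that calculation. It is deliberately unweighted (the real algorithm gives newer state changes more weight), and both the function name and the one-letter-per-check history string are assumptions made for illustration:

```shell
# Unweighted percent state change over a check history.
# $1: one letter per check result, oldest first,
#     e.g. "OOCOC" where O = OK and C = CRITICAL.
percent_state_change() {
    rest=$1
    total=$(( ${#rest} - 1 ))     # number of possible transitions
    changes=0
    prev=
    while [ -n "$rest" ]; do
        cur=${rest%"${rest#?}"}   # first character of what is left
        rest=${rest#?}            # drop that character
        if [ -n "$prev" ] && [ "$cur" != "$prev" ]; then
            changes=$(( changes + 1 ))
        fi
        prev=$cur
    done
    if [ "$total" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( changes * 100 / total ))
}

# Example: percent_state_change "OOOOOC"
```

A check that alternates between OK and CRITICAL on every run would score 100 here, which is the pattern that ends up reported as flapping.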
Re: Incorrect report
Posted: Sat Nov 23, 2013 7:43 am
by chitrangada
slansing wrote: Do you not have any warning or critical thresholds defined for that service check?
Could you please tell me how to set warning thresholds for a service? Is this the same as 'freshness_threshold'?
slansing wrote: Try just running the plugin script itself, it should output usage text and tell you what is required to make it functional.
As I mentioned, I already posted the output of the check_http command when I ran it manually on the command line. I tried again with more options; here are the commands I tried:
/usr/lib/nagios/plugins/check_http -I 188.95.227.20 -p 80 -t 30
/usr/lib/nagios/plugins/check_http -I 188.95.227.20 -t 30
/usr/lib/nagios/plugins/check_http -I 188.95.227.20 -t 30 -C 60
/usr/lib/nagios/plugins/check_http -I 188.95.227.20 -t 30 -C 60,60
But I got the same output for all of the above:
Connection timed out
HTTP CRITICAL - Unable to open TCP socket
Re: Incorrect report
Posted: Mon Nov 25, 2013 11:19 am
by sreinhardt
Well, since all of those still show as down even with extended timeouts, let's try nmap and verify that the port is actually open. Also, I should note that you did not specify -H for the host it should connect to.
/usr/lib/nagios/plugins/check_http -H 188.95.227.20 -p 80 -t 30
/usr/lib/nagios/plugins/check_http -H 188.95.227.20 -t 30
/usr/lib/nagios/plugins/check_http -H 188.95.227.20 -t 30 -C 60
/usr/lib/nagios/plugins/check_http -H 188.95.227.20 -t 30 -C 60,60
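On the earlier threshold question: as far as I know, check_http takes its response-time thresholds through -w (warning) and -c (critical), both in seconds. The uppercase -C used above is a different option: it sets the minimum number of days a certificate must still be valid, so it does not affect the critical threshold. freshness_threshold is unrelated as well; it controls how old a check result may get before Nagios treats it as stale. A minimal sketch, assuming the plugin path from this thread. The http_check wrapper name and the 5 and 10 second defaults are only illustrative:

```shell
PLUGIN=/usr/lib/nagios/plugins/check_http

# Run check_http against port 80 with response-time thresholds.
# $1: host or IP, $2: warning seconds (default 5), $3: critical seconds (default 10)
http_check() {
    "$PLUGIN" -H "$1" -p 80 -w "${2:-5}" -c "${3:-10}" -t 30
}

# Example: http_check 188.95.227.20 2 4
```

These thresholds only matter once the connection succeeds. While the socket cannot be opened at all, the result stays CRITICAL whatever -w and -c are set to.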
Re: Incorrect report
Posted: Fri Dec 27, 2013 5:20 am
by chitrangada
Hi,
Sorry for the late reply.
Here are the outputs for all the commands:
# /usr/lib/nagios/plugins/check_http -H 188.95.227.20 -t 30 -C 60,60
Connection timed out
HTTP CRITICAL - Unable to open TCP socket
# /usr/lib/nagios/plugins/check_http -H 188.95.227.20 -t 30 -C 60
Connection timed out
HTTP CRITICAL - Unable to open TCP socket
---------------------------------------------------------
# nmap -p 80 188.95.227.20
Starting Nmap 5.00 ( http://nmap.org ) at 2013-12-27 11:25 CET
Note: Host seems down. If it is really up, but blocking our ping probes, try -PN
Nmap done: 1 IP address (0 hosts up) scanned in 3.21 seconds
---------------------------------------------------------
# nmap -p 80 188.95.227.20 -PN
Starting Nmap 5.00 ( http://nmap.org ) at 2013-12-27 11:30 CET
Interesting ports on atapache.citynetwork.se (188.95.227.20):
PORT STATE SERVICE
80/tcp filtered http
Nmap done: 1 IP address (1 host up) scanned in 2.19 seconds
Could you please help me resolve this?
Thanks in advance.
Re: Incorrect report
Posted: Fri Dec 27, 2013 11:54 am
by abrist
The nmap scan is reporting that port 80 on the host 188.95.227.20 is filtered:
chitrangada wrote: 80/tcp filtered http
1. Is the host 188.95.227.20 a Windows or Linux box?
2. Is there a router/firewall between your nagios server and the 188.95.227.20 host?
3. Is a web server actually running on the remote host 188.95.227.20?
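Some background while you check those: nmap reports a port as open when something accepts the connection, closed when the host answers but nothing is listening, and filtered when the probes get no usable answer, which usually points to a packet filter dropping them. The same distinction can be approximated from the Nagios server with a plain TCP connect under a time limit. This is only a sketch under some assumptions: it needs bash (for /dev/tcp) and the coreutils timeout command, and the function names and labels are invented for this example:

```shell
# Translate the exit code of the timed connect attempt into a label.
classify_probe() {
    case "$1" in
        0)   echo "open" ;;                   # connection was accepted
        124) echo "no-response" ;;            # timeout killed the attempt; matches nmap's "filtered"
        *)   echo "closed-or-unreachable" ;;  # refused, or another connect error
    esac
}

# $1: host, $2: port, $3: seconds to wait (default 5)
probe_port() {
    rc=0
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null || rc=$?
    classify_probe "$rc"
}

# Example: probe_port 188.95.227.20 80
```

If this prints no-response for port 80 while the site loads fine in a browser from another network, that would suggest something between the Nagios server and the host is dropping the traffic.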