Flapping
Hi all,
I was using Nagios 3.5.0, but while trying to update it and get it working normally again I managed to destroy the whole OS. :) So I decided to reinstall from scratch. I was able to back up the .cfg files first. I installed a fresh Ubuntu Server 12.04.3 LTS with Nagios 3.5.1 and Plugins 1.4.15. After the installation succeeded, I added some .cfg files in nagios/etc/objects, such as routers.cfg, which holds all my routers. I deleted the template.cfg file and put my own in its place, and a few other .cfg files were either replaced with copies from the old install or newly added (each time I also set the ownership and permissions Nagios needs: chown nagios:nagios and chmod ...). It works, but from time to time the status area shows all hosts (not just the routers, all hosts) as down while their services are all up, and some hosts are OK. After some time everything goes back to normal. I used Nagios 3.5.0 for a long time and never saw a situation where all hosts are down and their services are up.
-
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Flapping
What do the comments say on those hosts? And what is their status information?
Re: Flapping
Here it is:
It says the host status is down, but the host is really up; I can access it.
I also don't know how active and passive checks can both be enabled. Should one of them be disabled?
Re: Flapping
Why are you using /bin/ping? You should be using the check_ping plugin from the plugins package you installed, which should be located in /usr/local/nagios/libexec, assuming you are running CentOS/RHEL. I'd recommend running that check_ping from the command line and seeing whether you get the same output as the web interface shows. If you get a valid response, that should be your solution.
Re: Flapping
I assumed the ping was doing something, so I removed it from the host, but the hosts are still shown as down and the status information still shows the same /bin/ping error. I don't know why it is using the /bin/ping command; I'm using the default settings and haven't changed anything. I did find a ping command in the cgi.cfg file that points to /bin/ping (ping_syntax=/bin/ping -n -U -c 5 $HOSTADDRESS$). How should I change this? Should I put just "ping_syntax=/$USER1$/check_ping", or the full path, /usr/local/nagios/libexec? I don't know how to change this. Also, I'm using Ubuntu Server 12.04.3 LTS.
Re: Flapping
You should already have a command defined that calls the ping check, as this is a basic feature of Nagios. Look inside your commands.cfg and you will find either check_ping or check_icmp being used in the "check-host-alive" command definition. I would set that command in your host definitions so that it is used for the host-alive check.
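As a rough sketch of what that looks like in a host definition: the host name, alias, and address below are placeholders, and the "generic-host" template name is an assumption (use whatever template your own template file defines).

```
# Hypothetical host entry; host_name, alias and address are placeholders.
define host{
        use                     generic-host      ; assumption: a host template with this name exists
        host_name               router-example
        alias                   Example router
        address                 192.0.2.1
        check_command           check-host-alive  ; the command defined in commands.cfg
        max_check_attempts      5
        }
```

If the template already sets check_command to check-host-alive, the line can be left out of the individual host.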
Re: Flapping
I have those commands in the commands.cfg file.
Do I now have to comment out (put # in front of) the /bin/ping command in the cgi.cfg file? And yes, I have "check_ping" and "check_icmp" in the libexec directory.
Code:
################################################################################
#
# SAMPLE HOST CHECK COMMANDS
#
################################################################################
# This command checks to see if a host is "alive" by pinging it
# The check must result in a 100% packet loss or 5 second (5000ms) round trip
# average time to produce a critical error.
# Note: Five ICMP echo packets are sent (determined by the '-p 5' argument)
# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
# 'check_icmp' command definition
define command{
        command_name    check_icmp
        command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 2
        }
Re: Flapping
Then you should be calling check-host-alive as the check command for your hosts, not /bin/ping. It also looks like you have the RTA and packet loss % hardcoded into the command; you could switch to arguments ($ARG1$, $ARG2$) in case you need to alter the thresholds for systems with traditionally high network latency.
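A minimal sketch of the argument-based version, assuming you add it as a separate command: the name "check-host-alive-args" is made up for illustration, and the thresholds passed in the host line are simply the ones already hardcoded in your check-host-alive.

```
# Hypothetical command: same check as check-host-alive, but the
# warning and critical thresholds come from $ARG1$ and $ARG2$.
define command{
        command_name    check-host-alive-args
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

# In a host definition the arguments follow the command name, separated by "!":
#       check_command   check-host-alive-args!3000.0,80%!5000.0,100%
```

A host on a slow link could then pass looser thresholds without touching the command definition.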
Re: Flapping
I'm not trying to call /bin/ping; I'm using check-host-alive. Right now I'm not using ping at all, and it still shows flapping and the /bin/ping error. Here is what I have for the ping service, and it is commented out:
Code:
#define service{
# use generic-service
# hostgroup_name router-bp
# service_description PING
# check_command check_ping!200.0,20%!600.0,60%
# normal_check_interval 5
# retry_check_interval 1
#}
I'm not sure I understand you here, please explain this part: "It looks like you have the RTA and packetloss % hardcoded into the command as well, you could switch this to use Arguments in case you need to alter it for systems with traditionally high network latency."
I just installed Nagios and copied router.cfg and template.cfg, and that is all (I set the right permissions and everything else needed). I have not changed any of the parameters.
Re: Flapping
Sorry, I was talking about your warning and critical thresholds: the round trip average (RTA) time is the first number, separated by a comma from the packet loss percentage.
Can you run check_ping manually from the command line and see whether you receive the /bin/ping error? I'm not sure why the host you showed is reporting "/bin/ping -n -U -w 30 -c 5 172.16.20.251", unless that is how you have it defined in the command. Which host's details did you show in the picture? Can you share the config file section for it?