Ping across subnets
Posted: Thu Aug 15, 2013 11:56 am
by jjwhite
Infrastructure Drawing August 22 2012.jpg
We have Core 3.5.0 installed and have some basic ping configs working fine. We have just purchased XI and will be installing it in the next couple of days. Running CentOS latest version on new dedicated PCs.
The issue we have is how to ping across subnets. I have uploaded our network config diagram. The Nagios boxes go at remote radio sites (Kiowa and Hilltop). We need to ping the local devices (IP223's, switch, router, Netbotz) and devices at the other end of the T1's, which are in the 10.1.4.x subnet. The local devices at each site are all 10.1.5.x. The Cisco routers provide T1 connections between sites. They have route tables to route traffic from DSCO to Kiowa and to Hilltop in both directions. So with a laptop plugged in at Kiowa, set to the 10.1.5.x subnet with mask 255.255.255.0 and gateway 10.1.5.19, I can ping 10.1.4.7 across the T1 at DCSO.
Additionally, the Netgear router at Kiowa, 10.1.5.1, provides WAN internet access for Nagios alerts. We also have port forwarding set up in those routers to allow administration of Nagios, CentOS, and the Netbotz from the WAN.
The question is, how do we set up CentOS and/or Nagios config files to allow Nagios to
1) ping 10.1.5.x devices
2) ping 10.1.4.x devices through the Ciscos
3) send alerts via the 10.1.5.1 router
At the moment we have two Nagios boxes and some switches and routers set up in our lab so we can figure out this last question; then we will deploy them to the sites. I have attached a diagram of that setup as well.
Jim
Re: Ping across subnets
Posted: Thu Aug 15, 2013 12:59 pm
by abrist
If XI has routes to those subnets, it should be able to ping them. If ICMP is restricted, or if the routes are locked down, you will have to use one of the following options:
1. Passive checks - the remote hosts report in at intervals shorter than the freshness threshold. When a host does not report passively within the freshness interval, the host will be marked as down.
2. Remote proxy agent - most likely NRPE. Set up a remote NRPE server on each subnet. Forward port 5666 from the gateway of each subnet to the NRPE server. Submit ping checks through NRPE to the remote NRPE server to check against hosts on the subnet.
3. Remote Nagios core or XI install that can check hosts and services on the subnet and push the results to the central XI server.
4. Something else?
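
As a rough illustration of the freshness idea in option 1, here is a minimal Python sketch. It is not how Nagios implements it internally; the function name `is_stale` and the 5-minute threshold are assumptions made up for the example.

```python
from datetime import datetime, timedelta


def is_stale(last_report, now, freshness_threshold):
    """Return True when the last passive result is older than the threshold."""
    return now - last_report > freshness_threshold


# Hypothetical values: a 5-minute freshness threshold, evaluated at 13:00.
now = datetime(2013, 8, 15, 13, 0)
threshold = timedelta(minutes=5)

# Reported 2 minutes ago: still fresh, so the host keeps its last state.
print(is_stale(datetime(2013, 8, 15, 12, 58), now, threshold))

# Reported 10 minutes ago: stale, so the host would be treated as down.
print(is_stale(datetime(2013, 8, 15, 12, 50), now, threshold))
```

The only input that matters is the age of the last passive result, so a site that stops reporting is flagged without the central server ever sending a ping.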
Re: Ping across subnets
Posted: Tue Aug 20, 2013 6:31 pm
by jjwhite
Thanks. We established a route and can now ping those 10.1.4 addresses.
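
For anyone following along, the sketch below shows how a box picks a gateway once such a route exists (longest-prefix match). The route table is an assumption pieced together from the addresses in the first post (10.1.4.x via 10.1.5.19, everything else via 10.1.5.1); it is not a dump of the actual CentOS routing table.

```python
import ipaddress

# Assumed route table for a Nagios box at Kiowa (illustrative only).
# A gateway of None means the network is directly connected.
ROUTES = [
    (ipaddress.ip_network("10.1.5.0/24"), None),         # local devices
    (ipaddress.ip_network("10.1.4.0/24"), "10.1.5.19"),  # across the T1
    (ipaddress.ip_network("0.0.0.0/0"), "10.1.5.1"),     # default, WAN alerts
]


def next_hop(destination):
    """Return the gateway of the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    # The default route matches everything, so matches is never empty.
    _net, gateway = max(matches, key=lambda m: m[0].prefixlen)
    return gateway


for dest in ("10.1.5.7", "10.1.4.7", "8.8.8.8"):
    print(dest, "->", next_hop(dest))
```

With only a default gateway of 10.1.5.1 and no entry for 10.1.4.0/24, traffic for 10.1.4.7 would go to the Netgear instead of the Cisco, which matches the behaviour we saw before adding the route.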
Next question.
What is the complete solution to deleting hosts and services and starting over with an empty config so we can build the config from scratch? Some old hosts show up in the dashboard but do not show up in config manager. Going to their details through the dashboard and using delete does not work.
We have deleted all files from the XI hosts and services folders, but both old and new hosts still show up in the dashboard (and send alerts).
We need the steps to get us to a clean start.
Re: Ping across subnets
Posted: Wed Aug 21, 2013 9:25 am
by slansing
If they are not present in the CCM (the MySQL database copy of the config), they are most likely either cached, so that you are retaining old results, or their flat config file is still present on the XI server. To check whether they are "ghost hosts" (retained flat config files), please follow this post:
http://support.nagios.com/wiki/index.ph ... t_Hosts.29
Re: Ping across subnets
Posted: Wed Aug 21, 2013 3:24 pm
by jjwhite
We read that post and followed its suggestions. Here are the results.
Write host configurations ...
Host configuration files successfully written!
Write service configurations ...
Service configuration files successfully written!
Configuration file: hostgroups.cfg successfully written!
Configuration file: servicegroups.cfg successfully written!
Configuration file: hosttemplates.cfg successfully written!
Configuration file: servicetemplates.cfg successfully written!
Configuration file: timeperiods.cfg successfully written!
Configuration file: commands.cfg successfully written!
Configuration file: contacts.cfg successfully written!
Configuration file: contactgroups.cfg successfully written!
Configuration file: contacttemplates.cfg successfully written!
Configuration file: servicedependencies.cfg successfully written!
Configuration file: hostdependencies.cfg successfully written!
Configuration file: serviceescalations.cfg successfully written!
Configuration file: hostescalations.cfg successfully written!
Configuration file: serviceextinfo.cfg successfully written!
Configuration file: hostextinfo.cfg successfully written!
Did a verify and got: verify failed: no services defined, no hosts defined.
Went to the monitoring menu: hosts: no hosts listed
Clicked on Home: details: Host detail: several hosts listed
Service detail : several listed
Clicked on a host in the host detail view, went to the Configure tab, clicked delete, and got the message "could not find a unique id for this host. Host cannot be deleted using this method"
Checked these directories and they are empty
/usr/local/nagios/etc/hosts
/usr/local/nagios/etc/services
Could the host and services cfg files be somewhere else?
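
One way to answer that is to search the whole config tree rather than just those two folders. The sketch below is a hedged example: the helper name `find_host_definitions` is made up, and it simply looks for `host_name` directives (which appear in both host and service definitions) in any `.cfg` file under a given root.

```python
import re
from pathlib import Path

# Matches lines such as "    host_name    router1" inside a definition.
HOST_NAME = re.compile(r"^\s*host_name\s+(\S+)", re.MULTILINE)


def find_host_definitions(root):
    """Map each .cfg file under root to the host_name values it contains."""
    found = {}
    for cfg in sorted(Path(root).rglob("*.cfg")):
        names = HOST_NAME.findall(cfg.read_text(errors="replace"))
        if names:
            found[str(cfg)] = names
    return found
```

Calling `find_host_definitions("/usr/local/nagios/etc")` and printing the result would list every file that still mentions a host, wherever it sits under that directory.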
Re: Ping across subnets
Posted: Wed Aug 21, 2013 3:30 pm
by slansing
Those would be the only places, unless you are using static hosts/services which would be in the following directory:
If nothing is there, I'd recommend trying to restore to an old snapshot via Admin > Config Snapshots.
Re: Ping across subnets
Posted: Fri Sep 13, 2013 6:46 pm
by jjwhite
We ended up wiping the machine and starting over with a fresh XI install, using the host config GUIs to set up hosts. That went well. Very easy to set up simple hosts that way.
We have one site on a satellite internet connection with a normal latency of up to about 1.2 seconds for a ping. XI is reporting that one down and up occasionally, I think just due to ping time.
I recall that with Nagios Core, building the config text files manually, there was a string that let you set some parameters of the ping. I don't see those in the Core Config Manager GUI.
Is it possible to set a long allowable delay for a ping, and if so how do we do that in XI?
Jim
Re: Ping across subnets
Posted: Mon Sep 16, 2013 9:35 am
by abrist
Yes. As the ping checks are affecting the host status of the objects, I am going to presume that it is the host-keep-alive check that needs to be altered. These are normally set up through a template (from our wizards). You could change the template's $ARGn$ values, or configure the check on the object itself.
Template:
Go to --> CCM --> Host Templates --> xiwizard_generic_host.
Change the $ARGn$ values to your liking.
Or, configure on the object itself:
Go to --> CCM --> Hosts --> Click on the host in question.
Add the check command "keep-host-alive" in the check command drop-down.
Add the following $ARGn$s, and alter them as you see fit:
$ARG1$: 3000.0
$ARG2$: 80%
$ARG3$: 5000.0
$ARG4$: 100%
You may have other issues, as a host will not be alerted as down until it has reached a HARD state. The default templates will not do this until five 1-minute retries have been attempted. This means that your hosts are failing to respond to pings for 5 or more minutes straight.
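
To make the HARD state point concrete, here is a simplified Python sketch. It only counts consecutive failed checks and assumes 5 attempts, as described above; real Nagios state handling has more cases (soft recoveries, flapping, and so on), and the function name is invented for the example.

```python
def first_hard_down(check_results, max_check_attempts=5):
    """Return the index of the check that makes the host HARD DOWN, or None.

    check_results is a sequence of "UP" / "DOWN" strings, one per check.
    """
    consecutive_failures = 0
    for index, result in enumerate(check_results):
        if result == "UP":
            # Any successful check resets the retry counter.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= max_check_attempts:
            return index
    return None


# Two short outages separated by a good check never reach a HARD state.
print(first_hard_down(["DOWN", "DOWN", "UP", "DOWN", "DOWN"]))  # None

# Five failures in a row do: the fifth check (index 4) triggers it.
print(first_hard_down(["DOWN"] * 5))  # 4
```

So an occasional slow ping on its own should not raise an alert; it takes a run of failed checks.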
Re: Ping across subnets
Posted: Tue Sep 17, 2013 4:21 pm
by jjwhite
Thanks. We were able to get that going and have had no more false alerts. To get the config accepted, we had to select check_ping and then fill in the arguments with
3000.0,80%
and
3000.0,100%
(We don't differentiate between levels of 'down' at this point.)
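
For reference, this is roughly how we understand those two arguments to be evaluated. It is a sketch and not the plugin's actual code: the first string is the warning pair (round-trip time in ms, packet loss), the second the critical pair, and the "greater than or equal" comparison is our assumption.

```python
def parse_threshold(text):
    """Split a '<rta_ms>,<loss>%' string into (rta_ms, loss_percent)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def ping_status(rta_ms, loss_percent, warning, critical):
    """Classify a ping result against warning and critical threshold pairs."""
    warn_rta, warn_loss = parse_threshold(warning)
    crit_rta, crit_loss = parse_threshold(critical)
    if rta_ms >= crit_rta or loss_percent >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_percent >= warn_loss:
        return "WARNING"
    return "OK"


WARN, CRIT = "3000.0,80%", "3000.0,100%"

# Normal satellite latency of about 1.2 s with no loss stays OK.
print(ping_status(1200.0, 0.0, WARN, CRIT))    # OK
# Heavy loss at the same latency becomes a warning.
print(ping_status(1200.0, 80.0, WARN, CRIT))   # WARNING
# Total loss is critical.
print(ping_status(1200.0, 100.0, WARN, CRIT))  # CRITICAL
```

Because both round-trip thresholds are 3000.0 ms, a slow reply goes straight to critical, which fits with not distinguishing levels of 'down'.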
Related question.
We built our hosts from a clean install using the config hosts wizard, which is very easy to use. We are using ping to monitor all hosts at this time (all are routers or switches). But when we select CCM, then the Common Settings tab, there is nothing populated in the Check Command field.
So we selected check_ping then filled in the arguments.
Why doesn't check_ping (or some other flavor of ping?) show up in that field when the wizard has been used to set up a host?
Jim
Re: Ping across subnets
Posted: Tue Sep 17, 2013 4:30 pm
by abrist
This is due to templating. All our wizards use some form of templates. Hosts are more often than not set up with the 'xiwizard_generic_host' template, which includes a keep-alive check. This check is inherited by the object through the template, though any settings configured directly on the host object will override the inherited settings (as your ping check did).
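
A small Python sketch of that override behaviour, for illustration only. The dictionaries, the host names, and the placeholder command name `template-keep-alive-check` are hypothetical; the point is that an empty field on the host means "use the template's value".

```python
def effective_settings(template, host):
    """Merge template and host settings; values set on the host win."""
    merged = dict(template)
    merged.update({key: value for key, value in host.items() if value is not None})
    return merged


# Hypothetical template contents (the real template defines more directives).
template = {"check_command": "template-keep-alive-check", "max_check_attempts": 5}

# A host created by the wizard: Check Command is empty in the CCM.
wizard_host = {"host_name": "router1", "check_command": None}

# A host with the check configured directly on the object.
tuned_host = {
    "host_name": "satellite-site",
    "check_command": "check_ping!3000.0,80%!3000.0,100%",
}

print(effective_settings(template, wizard_host)["check_command"])
print(effective_settings(template, tuned_host)["check_command"])
```

The first host ends up with the template's check even though its own field is blank, and the second keeps the check_ping command set on the object.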