Thanks for that solution; I'll give that a shot, but I was trying to avoid adding any services manually (or via the CLI). I just want to add the hosts, because I want them to inherit the services assigned to the predefined host group.
Let me try to do a better job of explaining what happens here so you can see what I'm up against, and maybe give a little more clarity on what I'm trying to accomplish (you might want to get a sandwich or a cup of coffee first, as this will probably be a bit lengthy):
BACKGROUND
We have developers that don't push any new code to an existing production server (that's another story for another day, but I digress). During an implementation, the developers spin up new replacement servers and put those new servers into production, then ask us to remove the old servers once the new servers are active in Nagios.
Once we receive the request to add new hosts, we pick a single "old" server and do a bulk clone using the host names of the new servers that were just spun up (the "old" server is already in a previously defined host group which has inherited the service checks).
Once the new servers are active in Nagios, we need to remove the "old" hosts and services from Nagios. We typically go in and bulk delete the services associated with each host (sometimes as many as 20 services per host) and then delete the host(s). For one or two servers, this doesn't totally suck, but it is rather inconvenient (now if Nagios would just let me delete a host and didn't give a rip about the services tied to the host, this part would certainly suck less). Doing it this way typically costs us 2-3 configuration applies. This adding and deleting of services and hosts happens several times a day.
Occasionally, the developers will want some additional service or drive monitored on their hosts. We then have to go in and add that service check to each host (either manually, or occasionally we can clone the service, depending on the number of services or servers to add that service check to). This happens several times a month.
As you can see, this is a total pain in the a$$.
HOW I'M TRYING TO FIX IT
To make this work much more logically (and suck less), I grouped a set of hosts into host groups. That way, I can add a new host to an existing host group and have it inherit the checks associated with that host group. To do that, I did this:
Create a host group called tomcat.
Create a host called tomcat and assign services to that host.
Assign the tomcat host to the tomcat host group.
Create a host group called apache_lb.
Create a host called apache_lb and assign services to that host.
Assign the apache_lb host to the apache_lb host group.
Create a host group called product_api.
Assign the tomcat host group to product_api.
Assign the apache_lb host group to product_api.
So now, when I want to report on the product_api hosts, it includes all the tomcat/apache_lb hosts in one host group.
WHAT I'M TRYING TO ACCOMPLISH
The developers have a script that generates a configuration file for each host when it is spun up in VCAC. That script uses the host's Chef data bag name to create the host group. The configuration file looks like this:
Code:
define host {
    host_name               myprodhost-1455.example.com
    address                 10.10.10.1
    use                     xiwizard_linuxserver_host
    hostgroups              production_api-apache_ws
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    contacts                noc_folks
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    icon_image              redhat.png
    statusmap_image         redhat.png
    _xiwizard               linux-server
    register                1
}
That configuration file gets copied to a directory on the Nagios host, and that directory gets checked every 30 minutes. If a new configuration file is found there, the script checks the host group database to verify that the host group actually exists. If it does, the config file is copied to /usr/local/nagios/etc/import and an email notification is sent to the NOC, who can then apply the configuration.
If that host group does not exist (which happens occasionally, as the developers may create a new host group on the fly), we move the configuration file to a holding area and notify the NOC that the host group does not exist in Nagios. The NOC creates the new host group, and the config file is then moved back out of the holding area so it gets processed at the next half-hour check.
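To make the flow above a bit more concrete, here is a rough Python sketch of the routing decision. This is not our actual script: the function names (hostgroups_in, route_config) and the holding directory are made up for illustration, and the set of known host groups is just passed in rather than looked up in the host group database. It also moves the file in both cases, which is a simplification.

```python
import re
import shutil
from pathlib import Path

# Matches the "hostgroups" directive inside a define host { ... } block.
HOSTGROUPS_RE = re.compile(r"^[ \t]*hostgroups[ \t]+(\S.*?)[ \t]*$", re.MULTILINE)


def hostgroups_in(config_text):
    """Return the host group names listed on the hostgroups line, if any."""
    match = HOSTGROUPS_RE.search(config_text)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def route_config(cfg_path, known_hostgroups, import_dir, holding_dir):
    """Move one host config to import_dir or holding_dir (hypothetical helper).

    known_hostgroups is an assumption: a set of names that already exist
    in Nagios, however that list is obtained.
    Returns (destination_path, missing_hostgroups).
    """
    cfg_path = Path(cfg_path)
    wanted = hostgroups_in(cfg_path.read_text())
    missing = [group for group in wanted if group not in known_hostgroups]

    # Any unknown host group sends the file to the holding area for the NOC.
    target_dir = Path(holding_dir) if missing else Path(import_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    destination = target_dir / cfg_path.name
    shutil.move(str(cfg_path), str(destination))
    return destination, missing
```

In our case import_dir would be /usr/local/nagios/etc/import, and the list of missing host groups is what would go into the email to the NOC.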
MY QUESTION
Is there a way to add the host to the host group without having to add additional checks (since they already exist)? I thought that if I was just trying to add the host to the (existing) host group, it would just add the host and wouldn't require me to add an additional check.
Does that make this any clearer or make more sense?