Service Escalations with Multiple Hosts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
KevinD
Posts: 26
Joined: Thu Mar 29, 2012 10:26 am

Service Escalations with Multiple Hosts

Post by KevinD »

We have come across what appears to be a bug with service escalations, but wanted to get some insight in the hope that perhaps we're just doing it wrong.

We have a relatively new install of Nagios XI on CentOS 5.6 (x64) with one parent and multiple children fed through DNX.
We have around 40 hosts in there now, each with checks, and each working perfectly fine for notifications to the default contacts of the hosts and services.
We have a .com site that has several of the same type of server in each tier (Web/App/DB).

The issue occurs when we set up an escalation.
We can select ALL servers in each tier and, say, an HTTPD check as the service, but the only service that gets selected is HTTPD for the first server.
None of the other servers get selected.

So we add an escalation with two hosts (xxxwb1 & xxxwb2) and one service (HTTPD) and we get the following:

Code:

select * from tbl_serviceescalation;
id	config_name	host_name	hostgroup_name	service_description	contacts	contact_groups	first_notification	last_notification	notification_interval	escalation_period	escalation_options	active	last_modified	access_rights	config_id
"45"	"testing"	"1"	"0"	"1"	"1"	"0"	"2"	"2"	"15"	"2"	"c"	"1"	"2012-03-29 10:55:48"	"<NULL>"	"1"

Code:

select * from tbl_lnkServiceescalationToHost where idMaster = 45;
idMaster	idSlave
"45"	"317"
"45"	"318"

Code:

select id, host_name from tbl_host where id in (317, 318);
id	host_name
"317"	"xxxwb1"
"318"	"xxxwb2"
Everything looks good so far: the two hosts we selected are indeed attached to the service escalation.
But now look at the service we associated:

Code:

select * from tbl_lnkServiceescalationToService where idMaster = 45;
idMaster	idSlave
"45"	"1784"

Code:

select id, config_name, host_name, service_description from tbl_service where id = 1784;
id	config_name	host_name	service_description
"1784"	"xxxwb0"	"1"	"HTTPD"
And there's the problem.
It picked up the HTTPD service for a different host.
This is HTTPD for xxxwb0, not xxxwb1 or xxxwb2,
and as a result, when we tested it, it did not escalate for wb1 or wb2, nor for wb0 for that matter.
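
For anyone who wants to reproduce the check above in one pass, here is a rough sketch. It uses an in-memory SQLite database as a stand-in for the real config database, which is an assumption purely for illustration: the tables are cut down to only the columns shown in the query output above, and the helper names (linked_hosts, linked_services) are made up. It also assumes, as we did, that config_name on the service row identifies the host the service belongs to.

```python
import sqlite3

# In-memory stand-in for the config database (assumption for illustration;
# the real tables have many more columns than the ones modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_host (id INTEGER PRIMARY KEY, host_name TEXT);
CREATE TABLE tbl_service (id INTEGER PRIMARY KEY, config_name TEXT,
                          service_description TEXT);
CREATE TABLE tbl_lnkServiceescalationToHost (idMaster INTEGER, idSlave INTEGER);
CREATE TABLE tbl_lnkServiceescalationToService (idMaster INTEGER, idSlave INTEGER);
INSERT INTO tbl_host VALUES (317, 'xxxwb1');
INSERT INTO tbl_host VALUES (318, 'xxxwb2');
INSERT INTO tbl_service VALUES (1784, 'xxxwb0', 'HTTPD');
INSERT INTO tbl_lnkServiceescalationToHost VALUES (45, 317);
INSERT INTO tbl_lnkServiceescalationToHost VALUES (45, 318);
INSERT INTO tbl_lnkServiceescalationToService VALUES (45, 1784);
""")


def linked_hosts(conn, escalation_id):
    """Host names attached to the escalation (hypothetical helper)."""
    rows = conn.execute(
        "SELECT h.host_name FROM tbl_lnkServiceescalationToHost l "
        "JOIN tbl_host h ON h.id = l.idSlave "
        "WHERE l.idMaster = ? ORDER BY h.host_name",
        (escalation_id,),
    )
    return [r[0] for r in rows]


def linked_services(conn, escalation_id):
    """(config_name, service_description) pairs attached to the escalation."""
    rows = conn.execute(
        "SELECT s.config_name, s.service_description "
        "FROM tbl_lnkServiceescalationToService l "
        "JOIN tbl_service s ON s.id = l.idSlave "
        "WHERE l.idMaster = ? ORDER BY s.config_name",
        (escalation_id,),
    )
    return [tuple(r) for r in rows]


hosts = linked_hosts(conn, 45)
services = linked_services(conn, 45)
# Services whose config_name is not one of the escalation's hosts.
mismatched = [s for s in services if s[0] not in hosts]
print("hosts:", hosts)
print("services:", services)
print("mismatched:", mismatched)
```

With the rows from our database, the only linked service shows up as mismatched, which is the behaviour described above.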

Oh great and wise Nagios gurus, please show me the error of my ways.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Service Escalations with Multiple Hosts

Post by mguthrie »

We changed how service escalations work in the UI a few versions ago to allow escalations to be defined for services that have service->hostgroup assignments. Can you go to the Core Config Manager->Service escalations page, click the download button, and post the configuration file that gets generated? The Nagios Core engine doesn't use the database at all to determine anything; the Core Config Manager is simply a manager for the configuration files underneath. Nagios compiles the config definitions down into individual host_name:service_description pairs.

So even if I have an escalation defined with multiple hosts and multiple services:

Code:

define serviceescalation {
        host_name       slashdot.org,host1,host2,host3
        service_description     DNS IP Match,DNS,PING
        }
It gets compiled down to definitions like the following in the objects.cache file that the Core engine uses, one per host:service pair:

Code:

define serviceescalation {
        host_name       slashdot.org
        service_description     DNS IP Match
        }
Hope that clarifies : )
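
To make the "compiled down" idea concrete, here is a minimal sketch in Python. It is not Nagios code; it only models the expansion as a cross product of the comma-separated host_name and service_description lists, which is an assumption about the behaviour described above. The function name expand_escalation is hypothetical, and the order of entries in the real objects.cache may differ.

```python
from itertools import product


def expand_escalation(host_names, service_descriptions):
    """Expand comma-separated host and service lists into (host, service) pairs.

    Hypothetical helper: models one serviceescalation definition being
    flattened into one entry per host:service combination.
    """
    hosts = [h.strip() for h in host_names.split(",")]
    services = [s.strip() for s in service_descriptions.split(",")]
    return list(product(hosts, services))


pairs = expand_escalation(
    "slashdot.org,host1,host2,host3",
    "DNS IP Match,DNS,PING",
)
for host, service in pairs:
    print(f"{host}:{service}")
```

Under that assumption, the 4 hosts and 3 services in the example definition give 12 pairs, and slashdot.org:DNS IP Match is just the first of them.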
KevinD
Posts: 26
Joined: Thu Mar 29, 2012 10:26 am

Re: Service Escalations with Multiple Hosts

Post by KevinD »

Your response was greatly appreciated.

When we originally pulled the config, there was nothing in it, but I honestly cannot remember whether we had applied the config before looking at it.

The configs do look right, and I don't think they will fail to escalate.
We did notice that after the patch, the behavior of the escalation screen changed slightly.

We used to be able to select multiple checks, as well as multiple hosts, and if one of the checks did not exist for one of the hosts, it would not complain. Now we get an error in configuration validation.

Is this expected behavior?

Again, thanks for the clarification, definitely helpful.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Service Escalations with Multiple Hosts

Post by mguthrie »

The downside of the added flexibility in the Core Config Manager is that it removed the controlled input for the service list. The service escalations page will allow you to create a configuration that won't work with Nagios Core, so you'll need to verify that a valid host:service combination exists for all of the hosts, hostgroups, and services that you've selected. If you view the text output from the config verification, you should be able to identify which host(s) are missing those services.

In short, the Core Config Manager will allow you to create service escalations with host:service combinations that don't exist, but Nagios itself doesn't allow that.
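
If it helps, the check described above can be sketched roughly as follows. This is only an illustration in Python, not something the Core Config Manager or Nagios Core provides: the function name missing_combinations and the sample host and service names are hypothetical, and the set of defined services would have to come from your own configuration.

```python
def missing_combinations(selected_hosts, selected_services, defined_services):
    """Return the (host, service) pairs an escalation needs but that do not exist.

    defined_services is a set of (host_name, service_description) tuples
    describing the services that are actually configured.
    """
    return [
        (host, service)
        for host in selected_hosts
        for service in selected_services
        if (host, service) not in defined_services
    ]


# Hypothetical sample data: host2 has no DNS service defined.
defined = {("host1", "DNS"), ("host1", "PING"), ("host2", "PING")}
missing = missing_combinations(["host1", "host2"], ["DNS", "PING"], defined)
print(missing)
```

Any pair that comes back here is one that the config verification would be expected to complain about, so either the service needs to be added to that host or the host removed from the escalation.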