Host Escalations

rentsys · Post by **rentsys** » Fri May 30, 2014 10:40 am

I am setting up Host Escalations so that my clients won't be notified when my network goes down.
I have it set so they do not get alerts when the host state is unreachable.
The problem is that they still get alerted when the host comes back up, even though their device never actually went down.
I read about Host Escalations, and they seem to be what I need.
I found this when reading about escalations: http://nagios.sourceforge.net/docs/3_0/escalations.html

If, after three problem notifications, a recovery notification is sent out for the service, who gets notified? The recovery is actually the fourth notification that gets sent out. However, the escalation code is smart enough to realize that only those people who were notified about the problem on the third notification should be notified about the recovery. In this case, the nt-admins and managers contact groups would be notified of the recovery.

I made this:

```
define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        escalation_options                      u,
        }
```

The contact assigned to TestUnknown124 and TestUnknown125 is robertsa.
TestUnknown124 is the parent of TestUnknown125.
I bring both devices down to simulate a network outage.
TestUnknown124 goes down and TestUnknown125 goes unreachable.
The host escalation works.
A down email is sent to robertsa and an unreachable email is sent to rentsys.
The problem is that when I bring TestUnknown124 and TestUnknown125 back up, it emails robertsa that both are back up,
instead of sending an up email for TestUnknown124 to robertsa and an up email for TestUnknown125 to rentsys.
I am running 2014 Production.
Did I set it up correctly?

slansing · Post by **slansing** » Fri May 30, 2014 12:27 pm

rentsys wrote: The problem is that when I bring TestUnknown124 and TestUnknown125 back up it emails robertsa that they are both back up.
Instead of sending an up email for TestUnknown124 to robertsa and an up email for TestUnknown125 to rentsys.

So your issue is that rentsys is not notified of the recovery? If it is recovering from a down state you will need to place "r" in your options as well. Is rentsys not assigned to those hosts normally? Try to replicate how you have robertsa defined on them.

rentsys · Post by **rentsys** » Fri May 30, 2014 12:57 pm

Yes, that is the issue.
robertsa shouldn't be notified about TestUnknown125.
Only rentsys should be notified.
If I add the "r", it notifies rentsys about both TestUnknown124 and TestUnknown125,
when it should only be TestUnknown125.
rentsys is not originally assigned to the hosts.

```
define host {
        host_name                       TestUnknown124
        use                             xiwizard_genericnetdevice_host
        alias                           alias
        address                         XXX.XXX.XXX.XXX
        initial_state                   u
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        active_checks_enabled           1
        passive_checks_enabled          0
        check_period                    xi_timeperiod_24x7
        flap_detection_enabled          0
        contacts                        robertsa
        notification_interval           0
        notification_period             xi_timeperiod_24x7
        notifications_enabled           1
        icon_image                      network_node.png
        _xiwizard                       genericnetdevice
        register                        1
        }

define host {
        host_name                       TestUnknown125
        use                             xiwizard_genericnetdevice_host
        alias                           alias
        address                         XXX.XXX.XXX.XXX
        parents                         TestUnknown124
        initial_state                   u
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        active_checks_enabled           1
        passive_checks_enabled          0
        check_period                    xi_timeperiod_24x7
        flap_detection_enabled          0
        contacts                        robertsa
        notification_interval           0
        notification_period             xi_timeperiod_24x7
        notifications_enabled           1
        icon_image                      network_node.png
        _xiwizard                       genericnetdevice
        register                        1
        }
```

Shouldn't the escalation code send the recovery notification only to the people who were notified about the problem?

tmcdonald · Post by **tmcdonald** » Fri May 30, 2014 3:19 pm

Can you confirm whether you are using Core or XI? You are posting in the XI forum but you seem to be using Core configs. If you are using XI, it might be helpful if you PM'ed one of us your system profile. If you are using Core, a zip of your etc directory will work as well. In either case, send along /usr/local/nagios/var/objects.cache too.

rentsys · Post by **rentsys** » Fri May 30, 2014 4:06 pm

I am using Nagios XI 2014R1.0. I'll pm tmcdonald.

tmcdonald · Post by **tmcdonald** » Mon Jun 02, 2014 11:11 am

rentsys wrote: robertsa shouldn't be notified about TestUnknown125.
Only rentsys should be notified.

So robertsa should *never* be notified? In that case you might want to add in a "dummy" escalation that covers all of the cases where rentsys is not being notified, so that robertsa never gets sent anything for the appropriate host.

rentsys · Post by **rentsys** » Mon Jun 02, 2014 11:32 am

Normally robertsa should be notified, but if there is a network outage robertsa shouldn't be alerted.
When the host that robertsa is assigned to goes unreachable first, robertsa shouldn't be alerted when it comes back up to an OK state.
Can you elaborate on the "dummy" escalation?

slansing · Post by **slansing** » Tue Jun 03, 2014 9:04 am

I believe what tmcdonald meant is that the first escalation should go to a fake contact, one that does not actually need to be notified, with an address that could point anywhere. That way, those notifications are effectively skipped because they are routed through the escalation chain instead of to the host's normal contacts. I'll double-check what he meant when he gets in.
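
One possible reading of the "dummy" escalation idea, as a rough sketch only: define a placeholder contact and route the notifications that should be suppressed to it. The contact name dummy_contact, the generic-contact template, and the email address are all hypothetical and do not come from this thread. The escalation reuses the values rentsys posted, limited here to TestUnknown125, and has not been tested.

```
# Hypothetical placeholder contact; nobody reads mail sent to it.
define contact {
        contact_name                    dummy_contact        ; hypothetical name
        use                             generic-contact      ; assumes a contact template with this name exists
        alias                           Placeholder contact
        email                           devnull@localhost    ; hypothetical address
        }

# Route UNREACHABLE and recovery notifications for the child host to the placeholder.
define hostescalation {
        host_name                       TestUnknown125
        contacts                        dummy_contact
        first_notification              1
        last_notification               0
        notification_interval           0
        escalation_period               24x7
        escalation_options              u,r
        }
```

Since the escalation_options documentation says an escalation is used whenever the host is in one of the listed states, this would presumably also catch recoveries that follow a plain DOWN state, so robertsa might miss those as well. Whether that trade-off is acceptable depends on the setup.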

rentsys · Post by **rentsys** » Tue Jun 03, 2014 10:05 am

Isn't that what I set up, other than the contact pointing to a real address?

scottwilkerson · Post by **scottwilkerson** » Tue Jun 03, 2014 3:03 pm

rentsys wrote:...
I made this:

```
define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        escalation_options                      u,
        }
```

The contact assigned to TestUnknown124 and TestUnknown125 is robertsa
TestUnknown124 is the parent of TestUnknown125.
I bring both devices down to simulate a network outage.
TestUnknown124 goes down and TestUnknown125 goes unreachable.
The host escalation works.
There is a down email sent to robertsa and an unreachable email sent to rentsys.
The problem is that when I bring TestUnknown124 and TestUnknown125 back up it emails robertsa that they are both back up.
Instead of sending an up email for TestUnknown124 to robertsa and an up email for TestUnknown125 to rentsys.
I am running 2014 Production.
Did I set it up correctly?

You are missing the r in escalation_options. It should be:

```
        escalation_options                      u,r
```

This directive is used to define the criteria that determine when this host escalation is used. The escalation is used only if the host is in one of the states specified in this directive. If this directive is not specified in a host escalation, the escalation is considered to be valid during all host states. Valid options are a combination of one or more of the following: r = escalate on an UP (recovery) state, d = escalate on a DOWN state, and u = escalate on an UNREACHABLE state. Example: If you specify d in this field, the escalation will only be used if the host is in a DOWN state.
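
Applied to the definition from the first post, the change would look roughly like the sketch below. Only escalation_options differs from what rentsys posted. Because the escalation lists both hosts, the r option applies to recoveries of both, which is consistent with what rentsys reported earlier in the thread (rentsys is notified about TestUnknown124 as well).

```
define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        escalation_options                      u,r     ; u = UNREACHABLE, r = UP (recovery)
        }
```

Limiting host_name to TestUnknown125 might be one way to keep TestUnknown124 recoveries going to robertsa, but that is an assumption and was not tested in this thread.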

Nagios Support Forum
