Host Escalations

rentsys
Posts: 98
Joined: Wed Oct 16, 2013 11:57 am

Host Escalations

Post by rentsys »

I am setting up Host Escalations so that my clients won't be notified when my network goes down.
I have it so they do not get alerts when the host state is unreachable.
The thing is that they get alerted when the host comes back up, even though their device never actually went down.
I read about Host Escalations, and it seems to be what I need.
I found this when reading about Escalations. http://nagios.sourceforge.net/docs/3_0/escalations.html
If, after three problem notifications, a recovery notification is sent out for the service, who gets notified? The recovery is actually the fourth notification that gets sent out. However, the escalation code is smart enough to realize that only those people who were notified about the problem on the third notification should be notified about the recovery. In this case, the nt-admins and managers contact groups would be notified of the recovery.
I made this

Code: Select all

define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        escalation_options                      u,
        }
The contact assigned to TestUnknown124 and TestUnknown125 is robertsa
TestUnknown124 is the parent of TestUnknown125.
I bring both devices down to simulate a network outage.
TestUnknown124 goes down and TestUnknown125 goes unreachable.
The host escalation works.
There is a down email sent to robertsa and an unreachable email sent to rentsys.
The problem is that when I bring TestUnknown124 and TestUnknown125 back up, it emails robertsa that they are both back up, instead of sending an up email for TestUnknown124 to robertsa and an up email for TestUnknown125 to rentsys.
I am running 2014 Production.
Did I set it up correctly?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Host Escalations

Post by slansing »

rentsys wrote: The problem is that when I bring TestUnknown124 and TestUnknown125 back up, it emails robertsa that they are both back up, instead of sending an up email for TestUnknown124 to robertsa and an up email for TestUnknown125 to rentsys.
So your issue is that rentsys is not notified of the recovery? If it is recovering from a down state you will need to place "r" in your options as well. Is rentsys not assigned to those hosts normally? Try to replicate how you have robertsa defined on them.
rentsys
Posts: 98
Joined: Wed Oct 16, 2013 11:57 am

Re: Host Escalations

Post by rentsys »

Yes, that is the issue.
robertsa shouldn't be notified about TestUnknown125.
Only rentsys should be notified.
If I add the "r", it notifies rentsys about both TestUnknown124 and TestUnknown125, when it should only be TestUnknown125.
rentsys is not originally on the hosts.

Code: Select all

define host {
        host_name                       TestUnknown124
        use                             xiwizard_genericnetdevice_host
        alias                           alias
        address                         XXX.XXX.XXX.XXX
        initial_state                   u
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        active_checks_enabled           1
        passive_checks_enabled          0
        check_period                    xi_timeperiod_24x7
        flap_detection_enabled          0
        contacts                        robertsa
        notification_interval           0
        notification_period             xi_timeperiod_24x7
        notifications_enabled           1
        icon_image                      network_node.png
        _xiwizard                       genericnetdevice
        register                        1
        }
define host {
        host_name                       TestUnknown125
        use                             xiwizard_genericnetdevice_host
        alias                           alias
        address                         XXX.XXX.XXX.XXX
        parents                         TestUnknown124
        initial_state                   u
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        active_checks_enabled           1
        passive_checks_enabled          0
        check_period                    xi_timeperiod_24x7
        flap_detection_enabled          0
        contacts                        robertsa
        notification_interval           0
        notification_period             xi_timeperiod_24x7
        notifications_enabled           1
        icon_image                      network_node.png
        _xiwizard                       genericnetdevice
        register                        1
        }
Shouldn't the escalation code send the recovery only to the people who were notified about the problem?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Host Escalations

Post by tmcdonald »

Can you confirm whether you are using Core or XI? You are posting in the XI forum, but you seem to be using Core configs. If you are using XI, it might be helpful if you PM'ed one of us your system profile. If you are using Core, a zip of your etc directory will work as well. In either case, send along /usr/local/nagios/var/objects.cache too.
rentsys
Posts: 98
Joined: Wed Oct 16, 2013 11:57 am

Re: Host Escalations

Post by rentsys »

I am using Nagios XI 2014R1.0. I'll PM tmcdonald.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Host Escalations

Post by tmcdonald »

rentsys wrote: robertsa shouldn't be notified about TestUnknown125.
Only rentsys should be notified.
So robertsa should *never* be notified? In that case you might want to add in a "dummy" escalation that covers all of the cases where rentsys is not being notified, so that robertsa never gets sent anything for the appropriate host.
rentsys
Posts: 98
Joined: Wed Oct 16, 2013 11:57 am

Re: Host Escalations

Post by rentsys »

Normally robertsa should be notified, but not during a network outage.
If the host robertsa is assigned to goes unreachable first, robertsa shouldn't be alerted when it comes back up to an OK state.
Can you elaborate on the "dummy" escalation?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Host Escalations

Post by slansing »

I believe what tmcdonald was referencing is that the first escalation should go to a fake contact, one that does not actually need to be notified, with an address that could point anywhere. That way, it will effectively skip those notifications by routing them through the escalation chain. I'll double-check what he meant when he gets in.
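As a rough illustration of that idea, a "dummy" escalation might look something like the sketch below. This is only an assumption about what was meant, not a tested fix: the contact name dummy_contact, the address nobody@localhost, the choice of states, and the notification command names (taken from the stock Core sample config; XI installs normally use their own handlers) are all hypothetical placeholders to adapt.

```
# Hypothetical placeholder contact: nobody actually reads this mailbox.
define contact {
        contact_name                    dummy_contact
        alias                           Placeholder contact
        host_notifications_enabled      1
        service_notifications_enabled   0
        host_notification_period        xi_timeperiod_24x7
        service_notification_period     xi_timeperiod_24x7
        host_notification_options       d,u,r
        service_notification_options    n
        # Assumed command names from the Core sample config
        host_notification_commands      notify-host-by-email
        service_notification_commands   notify-service-by-email
        email                           nobody@localhost
        }

# Escalation covering the states that the real escalation does not handle,
# so these notifications go to the placeholder rather than the host contact.
define hostescalation {
        host_name                       TestUnknown125
        contacts                        dummy_contact
        first_notification              1
        last_notification               0
        notification_interval           0
        escalation_period               24x7
        escalation_options              d,r
        }
```

The reasoning is that when an escalation matches a notification, its contacts are used in place of the host's own contacts, so the host contact would not be emailed for the states listed here. Whether that is desirable depends on which notifications the host contact should still receive.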
rentsys
Posts: 98
Joined: Wed Oct 16, 2013 11:57 am

Re: Host Escalations

Post by rentsys »

Isn't that what I set up, other than the contact pointing to a real address?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Host Escalations

Post by scottwilkerson »

rentsys wrote:...
I made this

Code: Select all

define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        escalation_options                      u,
        }
You are missing the r in escalation_options. It should be:

Code: Select all

        escalation_options                      u,r
This directive is used to define the criteria that determine when this host escalation is used. The escalation is used only if the host is in one of the states specified in this directive. If this directive is not specified in a host escalation, the escalation is considered to be valid during all host states. Valid options are a combination of one or more of the following: r = escalate on an UP (recovery) state, d = escalate on a DOWN state, and u = escalate on an UNREACHABLE state. Example: If you specify d in this field, the escalation will only be used if the host is in a DOWN state.
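For reference, the complete definition with that one-line change applied would look roughly like this. It is the definition quoted above with only escalation_options changed, so every other value is carried over unmodified from the original post.

```
define hostescalation {
        host_name                               TestUnknown124,TestUnknown125
        contacts                                rentsys
        first_notification                      1
        last_notification                       0
        notification_interval                   0
        escalation_period                       24x7
        # u = UNREACHABLE, r = UP (recovery); no trailing comma
        escalation_options                      u,r
        }
```

Because host_name still lists both hosts, this escalation applies to recoveries on either one, which matches what rentsys reported earlier in the thread when adding "r": recovery emails for both TestUnknown124 and TestUnknown125 go to rentsys.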