Escalation of Passive SNMP traps

Post by **evil_del** » Thu Feb 05, 2015 5:13 am

Hi,

Problem:
We have a switch that sends Nagios an SNMP trap when a link goes down and another when it recovers. Each trap triggers an SMS notification to the Support Engineer. Sometimes a link goes down for only a few seconds, and we would like to avoid sending an SMS in that case. We still want every trap recorded in Nagios, but we only want to be notified if the link has been down for more than 1 minute.

Attempted solution:
The first idea was to create escalations for this service. The first notification would only send an email to support, with no SMS. After 1 min, it would be escalated to the Support Engineer via SMS. After a further 20 mins, it would be escalated to the Shadow Support Engineer.

The config is below, but during testing it did not work as intended. As soon as the link went down, the email was sent as expected, but after 1 min there was no SMS notification, so no escalation took place.

My theory:
Since the SNMP trap is passive, once the link goes down there are no further traps to trigger the escalation of this notification. Is this true? Can this problem be escalated if it's a passive check?
What is the best way to handle escalation of SNMP traps?

Thanks in advance.

Code:

# from /etc/nagios/escalations.cfg

define serviceescalation {
        host_name               switch
        service_description     link1
        first_notification      1
        last_notification       0
        contact_groups          support_email_only
        notification_interval   1
}
 
define serviceescalation {
        host_name               switch
        service_description     link1
        first_notification      2
        last_notification       0
        contact_groups          sms_primary_support_engineer
        notification_interval   20
}

define serviceescalation {
        host_name               switch
        service_description     link1
        first_notification      3
        last_notification       0
        contact_groups          sms_shadow_support_engineer
        notification_interval   20
}

Post by **Box293** » Thu Feb 05, 2015 10:10 pm

evil_del wrote:My theory:
Since the SNMP trap is passive, once the link goes down there are no further traps to trigger the escalation of this notification. Is this true? Can this problem be escalated if it's a passive check?

Your understanding of this is spot on.

evil_del wrote:What is the best way to handle escalation of SNMP traps?

I did some reading on this, and other people have had similar issues. The solutions proposed centred around event handlers and checking whether the problem had been acknowledged. One option might be to create a separate check in Nagios that queries all passive checks, finds any problems that have not been acknowledged, and alerts based on that.
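As a rough illustration of the "separate check" idea, here is a minimal Python sketch of an actively scheduled plugin that scans the Nagios status file for passive services that have been in a problem state, unacknowledged, for longer than a threshold. It is only a sketch under assumptions: the function names are made up for this example, the status file path is a guess that varies by install, and the field names (`check_type`, `current_state`, `last_state_change`, `problem_has_been_acknowledged`) reflect my understanding of the `servicestatus` blocks in `status.dat`, so verify them against your own file before relying on it.

```python
import time


def parse_service_blocks(text):
    """Yield one dict of key=value pairs per 'servicestatus { ... }' block."""
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            block = {}
        elif line == "}":
            if block is not None:
                yield block
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value


def stale_passive_problems(text, min_age_seconds=60, now=None):
    """Return (host, service, age_seconds) for passive services that are in a
    non-OK state, not acknowledged, and unchanged for min_age_seconds or more."""
    now = time.time() if now is None else now
    found = []
    for svc in parse_service_blocks(text):
        is_passive = svc.get("check_type") == "1"        # assumed: 1 = passive
        is_problem = svc.get("current_state", "0") != "0"  # 0 = OK
        is_acked = svc.get("problem_has_been_acknowledged", "0") == "1"
        age = now - int(svc.get("last_state_change", "0"))
        if is_passive and is_problem and not is_acked and age >= min_age_seconds:
            found.append((svc.get("host_name"),
                          svc.get("service_description"),
                          int(age)))
    return found


def run_check(status_path="/usr/local/nagios/var/status.dat", min_age_seconds=60):
    """Return (exit_code, message) following the usual plugin convention:
    0 = OK, 2 = CRITICAL. The default path is an assumption; adjust it."""
    with open(status_path) as fh:
        problems = stale_passive_problems(fh.read(), min_age_seconds)
    if problems:
        details = ", ".join(f"{h}/{s} down {a}s" for h, s, a in problems)
        return 2, f"CRITICAL - unacknowledged passive problems: {details}"
    return 0, "OK - no unacknowledged passive problems past the threshold"
```

A small wrapper script would print the message and call `sys.exit()` with the returned code. Because this check is active and runs on a schedule, its own notifications and escalations (for example, the SMS contact groups) can fire normally, while the passive trap service keeps recording every trap.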

Nagios Support Forum
