Managing SNMP Traps

vmesquita · Post by **vmesquita** » Mon May 15, 2017 3:44 pm

Hello!

We have been using SNMP traps successfully in Nagios for a while now. It works well but has some limitations. For instance, we are using it to manage a network switch. A power supply failure is of course much more relevant than a port being disconnected; however, if both alerts arrive one after the other, the power supply alert is overwritten by the newer one in the alert information.

With this in mind, is there a better way of managing SNMP traps? Or maybe another tool I could use together with Nagios, or by itself? Please advise.

dwhitfield · Post by **dwhitfield** » Mon May 15, 2017 4:28 pm

So, let's take a step back. Are you also using SNMP traps for your host check? Are you using host checks at all? If you aren't using host checks, this would be an easy way to avoid the SNMP trap issue. The default is ping, but you can change the host check if you like.

All of the SNMP trap messages should be in the state history, so looking there is an option.

If you are notifying on each critical, you should get emails for both. Perhaps you could route messages differently, using email folders or some other rule (for example, a power outage shows up as high priority in email).

There are other methods, but I think they are more complicated than any of these.

SteveBeauchemin · Post by **SteveBeauchemin** » Mon May 15, 2017 5:06 pm

Thinking a little out of the box... I have a suggestion - in the snmptt.conf file.

EXEC /usr/local/nagios/libexec/submit_check_result $r TRAP 1 "This SNMP trap is generated when a console detects an issue  $*"

In the EXEC line, the word "TRAP" will be the name of the Nagios service where this result will be displayed. Why does it have to be TRAP, and why limit it to one service?

What if you changed some of those line items to different names, such as PowerSupply or Interface?

EVENT [some Power name] [some oid] more words
FORMAT blah blah
EXEC /usr/local/nagios/libexec/submit_check_result $r PowerSupply [severity] "This SNMP trap is generated for Power issue  $*"

or

EVENT [some Interface thing] [some different oid] more words
FORMAT blah blah
EXEC /usr/local/nagios/libexec/submit_check_result $r Interface [severity] "This SNMP trap is generated when Interface burps $*"

where [severity] is the Nagios return code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), or whatever you need.

Remember, you control the horizontal and the vertical.
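For context: every service name used in an EXEC line (PowerSupply, Interface, ...) needs a matching passive service on that host in Nagios, because the helper script only hands Nagios a passive check result. Below is a minimal Python sketch of the command line such a helper writes. It assumes the standard `PROCESS_SERVICE_CHECK_RESULT` external command format; the function name, host name and sample messages are made up for illustration and are not from this thread.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    Hypothetical helper for illustration; the real submit_check_result
    script appends a line like this to the Nagios command file.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    timestamp = int(time.time() if now is None else now)
    # Collapse newlines and repeated whitespace so the command stays on one line.
    clean_output = " ".join(output.split())
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{clean_output}"


# Two traps from the same (hypothetical) switch land on two different services,
# so the port alert no longer overwrites the power supply alert.
lines = [
    passive_result_line("switch01", "PowerSupply", 2, "Power supply 1 failed", now=1494873840),
    passive_result_line("switch01", "Interface", 1, "Port 12 link down", now=1494873845),
]
for line in lines:
    print(line)
```

Since each result is keyed by host plus service name, splitting traps across services is what keeps the alerts from overwriting each other.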

A more complete example - Service is named PortDown, associated to a SAN device.

EVENT ssaPortDownEvent1 .1.3.6.1.4.1.3764.1.1.400.10.10.1000.0.2 "Status Events" Normal
FORMAT An SSA Port Down Event has occurred. $*
EXEC /usr/local/nagios/libexec/eventhandlers/submit_check_result $r PortDown 1 "An SSA Port Down Event has occurred. $*"
SDESC
An SSA Port Down Event has occurred.
Variables:
  1: componentId
  2: paTrapSequenceNumber
  3: paTime
  4: paProducer
  5: paEventClass
  6: paEventCode
  7: paSeq
  8: paEventVars
EDESC

This is fun stuff...

Steve B

dwhitfield · Post by **dwhitfield** » Tue May 16, 2017 9:22 am

Thanks SteveB!

@vmesquita, I think Steve's suggestion is great because it doesn't involve each user changing email boxes, but it does require editing the snmptt.conf on a production system.

Ultimately, I still think the host check is the best way to deal with this issue, if it's not being used for any other purpose. If the host check doesn't seem like a good option, you certainly have some other options.

SteveBeauchemin · Post by **SteveBeauchemin** » Tue May 16, 2017 12:09 pm

That Trap Translator file can be fun to play with.

Here is one more example... It takes the Alert ID code from the trap and builds a URL that is shown in the Nagios Status Information field. You can then click the TRAP-AirWave service in the Nagios GUI to open the management tool directly at the Event ID passed in from the trap. This is for Aruba wireless access points and the AirWave management tool.

EVENT downAP .1.3.6.1.4.1.12028.4.15.0.13 "Status Events" Normal
FORMAT For Host: APNAME The device is down $*
EXEC /usr/local/nagios/libexec/submit_check_result APNAME TRAP-AirWave $2 "The device is down - $3 $4"
REGEX ((https:\S+):\s+)("\<A HREF=\\\"$1\\\" TARGET=\\\"AirWave\\\"\>Click for AirWave\</A\> ")ei
REGEX (APNAME\s+TRAP-AirWave\s+(.*)\s+Device:\s+(\S+)\s+)(lc($2).".domain.com TRAP-AirWave $1 Device: ".lc($2)." ")ie
REGEX (Host: APNAME\s+(.*)\s+Device:\s+(\S+)\s+)("Host: ".lc($2)." $1 Device: ".lc($2)." ")ie
REGEX (TRAP-AirWave 2)(TRAP-AirWave three)
REGEX (TRAP-AirWave 3)(TRAP-AirWave zero)
REGEX (TRAP-AirWave 4)(TRAP-AirWave one)
REGEX (TRAP-AirWave 5)(TRAP-AirWave two)
REGEX (TRAP-AirWave zero)(TRAP-AirWave 0)
REGEX (TRAP-AirWave one)(TRAP-AirWave 1)
REGEX (TRAP-AirWave two)(TRAP-AirWave 2)
REGEX (TRAP-AirWave three)(TRAP-AirWave 3)
REGEX (\?)(%3F)g
SDESC
This trap is sent when the AP is down
(for instance, a missed SNMP Ping or SNMP Get).
Variables:
  1: awampEventID
  2: awampEventSeverityCode
  3: awampEventDescription
  4: awampAPIP
EDESC

It provides a nice message for the wireless admin: they can click on it and jump right to the bad device in their GUI, at the exact event.
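A note on the block of REGEX lines that swap numbers for words and back: the substitutions run one after another, so replacing 2 with 3 directly would let the next rule turn that 3 into 0. Going through placeholder words avoids the collision. The Python sketch below mimics that two-pass trick, assuming the rules are applied in the order listed; the function names and sample command string are hypothetical.

```python
import re

# First pass: incoming severity code -> placeholder word (same order as the REGEX lines).
TO_WORD = [("2", "three"), ("3", "zero"), ("4", "one"), ("5", "two")]
# Second pass: placeholder word -> Nagios return code.
TO_CODE = [("zero", "0"), ("one", "1"), ("two", "2"), ("three", "3")]


def apply_rules(command, rules):
    """Apply each search/replace rule once, in order, like a list of REGEX lines."""
    for old, new in rules:
        command = re.sub(f"TRAP-AirWave {old}", f"TRAP-AirWave {new}", command, count=1)
    return command


def remap_two_pass(command):
    """Remap via placeholder words so later rules cannot re-match earlier results."""
    return apply_rules(command, TO_WORD + TO_CODE)


def remap_naive(command):
    """Direct number-to-number rules; shown only to illustrate the collision."""
    return apply_rules(command, [("2", "3"), ("3", "0"), ("4", "1"), ("5", "2")])


sample = 'ap01.domain.com TRAP-AirWave 2 "The device is down"'
print(remap_two_pass(sample))  # severity 2 ends up as return code 3
print(remap_naive(sample))     # severity 2 is turned into 3 and then into 0
```

The net mapping is 2 to 3, 3 to 0, 4 to 1 and 5 to 2, which lines the trap's severity code up with the Nagios return codes.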

Like I said before... fun stuff.

Don't limit yourself. Don't be afraid to try things. Think outside the box but remember "There is no box."

Steve B

dwhitfield · Post by **dwhitfield** » Tue May 16, 2017 12:12 pm

SteveBeauchemin wrote: That Trap Translator file can be fun to play with.

On a test box!

vmesquita · Post by **vmesquita** » Tue May 16, 2017 2:36 pm

dwhitfield and SteveBeauchemin,

Thanks so much for the suggestions! Both were very helpful. However, I think Steve's solution is better for what I am trying to accomplish, so I'll try to implement it.

dwhitfield · Post by **dwhitfield** » Tue May 16, 2017 2:54 pm

Please let us know if you run into any issues.

SteveBeauchemin · Post by **SteveBeauchemin** » Wed May 17, 2017 10:29 am

On a test box!

dwhitfield · Post by **dwhitfield** » Wed May 17, 2017 10:40 am

SteveBeauchemin wrote: On a test box!

You're probably going to want to let us know if you have any problems once this goes into production too, but play time should be over by then.

Nagios Support Forum
