Managing SNMP Traps
Hello!
We have been using SNMP traps successfully in Nagios for a while now. It works well but has some limitations. For instance, we are using it to monitor a network switch. A power supply failure is of course much more relevant than a port being disconnected; however, if both alerts arrive one after the other, the power supply alert is overwritten by the newer one in the alert information.
With this in mind, is there a better way of managing SNMP traps? Or maybe another tool I could use together with Nagios, or by itself? Please advise.
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Managing SNMP Traps
So, let's take a step back. Are you also using SNMP traps for your host check? Are you using host checks at all? If you aren't using host checks, this would be an easy way to avoid the SNMP trap issue. The default is ping, but you can change the host check if you like.
All of the SNMP trap messages should be in the state history, so looking there is an option.
If you are notifying on each critical, you should get emails for both. Perhaps you could route messages differently in email folders or with some other rule (for example, a power outage shows up as high priority in email).
There are other methods, but I think they are more complicated than any of these.
-
SteveBeauchemin
- Posts: 524
- Joined: Mon Oct 14, 2013 7:19 pm
Re: Managing SNMP Traps
Thinking a little outside the box... I have a suggestion - in the snmptt.conf file.
In the EXEC line, the word "TRAP" will be the name of the Nagios service where this result will be displayed. Why does it have to be TRAP, and why limit it to one service?
Code: Select all
EXEC /usr/local/nagios/libexec/submit_check_result $r TRAP 1 "This SNMP trap is generated when a console detects an issue $*"
What if you changed some of those line items to different names, such as PowerSupply or Interface?
Code: Select all
EVENT [some Power name] [some oid] more words
FORMAT blah blah
EXEC /usr/local/nagios/libexec/submit_check_result $r PowerSupply [severity] "This SNMP trap is generated for Power issue $*"
or
Code: Select all
EVENT [some Interface thing] [some different oid] more words
FORMAT blah blah
EXEC /usr/local/nagios/libexec/submit_check_result $r Interface [severity] "This SNMP trap is generated when Interface burps $*"
where [severity] is 1 2 3 or whatever you need.
Remember, you control the horizontal and the vertical.
A more complete example - Service is named PortDown, associated to a SAN device.
Code: Select all
EVENT ssaPortDownEvent1 .1.3.6.1.4.1.3764.1.1.400.10.10.1000.0.2 "Status Events" Normal
FORMAT An SSA Port Down Event has occurred. $*
EXEC /usr/local/nagios/libexec/eventhandlers $r PortDown 1 "An SSA Port Down Event has occurred. $*"
SDESC
An SSA Port Down Event has occurred.
Variables:
1: componentId
2: paTrapSequenceNumber
3: paTime
4: paProducer
5: paEventClass
6: paEventCode
7: paSeq
8: paEventVars
EDESC
This is fun stuff...
Steve B
XI 5.7.3 / Core 4.4.6 / NagVis 1.9.8 / LiveStatus 1.5.0p11 / RRDCached 1.7.0 / Redis 3.2.8 /
SNMPTT / Gearman 0.33-7 / Mod_Gearman 3.0.7 / NLS 2.0.8 / NNA 2.3.1 /
NSClient 0.5.0 / NRPE Solaris 3.2.1 Linux 3.2.1 HPUX 3.2.1
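To see why separate service names help, here is a minimal Python sketch (not from the posts above). It assumes the stock submit_check_result script, which hands Nagios a PROCESS_SERVICE_CHECK_RESULT external command, and it models the status view as "latest result per host and service". The host name switch01, the trap texts, and the helper names are made up for illustration.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Build one external command line of the kind submit_check_result writes.

    Assumption: the stock script format
    [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


def latest_status(results):
    """Keep only the newest result per (host, service), like a status page."""
    status = {}
    for host, service, code, output in results:
        status[(host, service)] = (code, output)
    return status


# Every trap goes to the single TRAP service: the second result replaces the first.
single = latest_status([
    ("switch01", "TRAP", 2, "Power supply failed"),
    ("switch01", "TRAP", 1, "Port 12 link down"),
])

# One service per trap type: both results stay visible side by side.
split = latest_status([
    ("switch01", "PowerSupply", 2, "Power supply failed"),
    ("switch01", "Interface", 1, "Port 12 link down"),
])

if __name__ == "__main__":
    print(passive_result_line("switch01", "PowerSupply", 2, "Power supply failed"))
    print("single TRAP service:", single)
    print("split services:", split)
```

Each service name used in an EXEC line also needs a matching service defined on that host in Nagios, set up to accept passive check results; otherwise the submitted result has nowhere to land.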
-
dwhitfield
Re: Managing SNMP Traps
Thanks SteveB!
@vmesquita, I think Steve's suggestion is great because it doesn't require each user to change their email setup, but it does require editing snmptt.conf on a production system.
Ultimately, I still think the host check is the best way to deal with this issue if it's not being used for any other purpose, but if the host check doesn't seem like a good fit, you certainly have other options.
-
SteveBeauchemin
Re: Managing SNMP Traps
That Trap Translator file can be fun to play with.
Here is one more example... It gets the Alert ID code from the trap and creates a URL that appears in the Nagios Status Information field. You can then click on the TRAP-AirWave service in the Nagios GUI to open the management tool directly at the Event ID passed in from the trap. This is for Aruba wireless access points and the AirWave management tool.
It provides a nice message for the wireless admin: they can click on it and jump right to the bad device in their GUI, to the exact event.
Code: Select all
EVENT downAP .1.3.6.1.4.1.12028.4.15.0.13 "Status Events" Normal
FORMAT For Host: APNAME The device is down $*
EXEC /usr/local/nagios/libexec/submit_check_result APNAME TRAP-AirWave $2 "The device is down - $3 $4"
REGEX ((https:\S+):\s+)("\<A HREF=\\\"$1\\\" TARGET=\\\"AirWave\\\"\>Click for AirWave\</A\> ")ei
REGEX (APNAME\s+TRAP-AirWave\s+(.*)\s+Device:\s+(\S+)\s+)(lc($2).".domain.com TRAP-AirWave $1 Device: ".lc($2)." ")ie
REGEX (Host: APNAME\s+(.*)\s+Device:\s+(\S+)\s+)("Host: ".lc($2)." $1 Device: ".lc($2)." ")ie
REGEX (TRAP-AirWave 2)(TRAP-AirWave three)
REGEX (TRAP-AirWave 3)(TRAP-AirWave zero)
REGEX (TRAP-AirWave 4)(TRAP-AirWave one)
REGEX (TRAP-AirWave 5)(TRAP-AirWave two)
REGEX (TRAP-AirWave zero)(TRAP-AirWave 0)
REGEX (TRAP-AirWave one)(TRAP-AirWave 1)
REGEX (TRAP-AirWave two)(TRAP-AirWave 2)
REGEX (TRAP-AirWave three)(TRAP-AirWave 3)
REGEX (\?)(%3F)g
SDESC
This trap is sent when the AP is down
(for instance, a missed SNMP Ping or SNMP Get).
Variables:
1: awampEventID
2: awampEventSeverityCode
3: awampEventDescription
4: awampAPIP
EDESC
Like I said before... fun stuff.
Don't limit yourself. Don't be afraid to try things. Think outside the box but remember "There is no box."
Steve B
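The eight TRAP-AirWave REGEX lines above remap the severity code in two passes (2 to 3, 3 to 0, 4 to 1, 5 to 2), going through placeholder words first. The following Python sketch shows why the placeholders appear to be needed. It rests on two assumptions: that the rules run top to bottom on the same line, and that a REGEX line without the g modifier replaces only the first match. The naive rule list and the sample line are hypothetical, written only for comparison.

```python
import re

# Rules copied in order from the REGEX lines: digits become placeholder
# words first, then the words become the final digits.
TWO_PHASE_RULES = [
    ("TRAP-AirWave 2", "TRAP-AirWave three"),
    ("TRAP-AirWave 3", "TRAP-AirWave zero"),
    ("TRAP-AirWave 4", "TRAP-AirWave one"),
    ("TRAP-AirWave 5", "TRAP-AirWave two"),
    ("TRAP-AirWave zero", "TRAP-AirWave 0"),
    ("TRAP-AirWave one", "TRAP-AirWave 1"),
    ("TRAP-AirWave two", "TRAP-AirWave 2"),
    ("TRAP-AirWave three", "TRAP-AirWave 3"),
]

# Hypothetical naive version that rewrites digit to digit directly.
NAIVE_RULES = [
    ("TRAP-AirWave 2", "TRAP-AirWave 3"),
    ("TRAP-AirWave 3", "TRAP-AirWave 0"),
    ("TRAP-AirWave 4", "TRAP-AirWave 1"),
    ("TRAP-AirWave 5", "TRAP-AirWave 2"),
]


def apply_rules(line, rules):
    """Apply each rule once, in order, to the running result."""
    for pattern, replacement in rules:
        line = re.sub(re.escape(pattern), replacement, line, count=1)
    return line


if __name__ == "__main__":
    sample = "ap-01.domain.com TRAP-AirWave 2 Device: ap-01"
    # Two-phase: the 2 becomes "three" and only later becomes 3.
    print(apply_rules(sample, TWO_PHASE_RULES))
    # Naive: the 2 becomes 3, which the very next rule turns into 0.
    print(apply_rules(sample, NAIVE_RULES))
```

With the naive list, a trap that should land on return code 3 would end up reported as 0, which Nagios treats as OK. The placeholder words keep an already rewritten code from being matched again by a later rule.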
-
dwhitfield
Re: Managing SNMP Traps
SteveBeauchemin wrote: That Trap Translator file can be fun to play with.
On a test box!
Re: Managing SNMP Traps
dwhitfield and SteveBeauchemin,
Thanks so much for the suggestions! Both were very helpful. However, I think Steve's solution is better for what I am trying to accomplish, so I'll try to implement it.
-
dwhitfield
Re: Managing SNMP Traps
Please let us know if you run into any issues.
-
SteveBeauchemin
Re: Managing SNMP Traps
On a test box! 
-
dwhitfield
Re: Managing SNMP Traps
SteveBeauchemin wrote: On a test box!
You're probably going to want to let us know if you have any problems once this goes into production too, but play time should be over by then.