Trouble getting SNMP Traps to show in Nagios XI

rkymtnhigh · Post by **rkymtnhigh** » Tue Jul 28, 2015 11:20 am

I've followed the installation notes here:

http://askaralikhan.blogspot.com/2010/1 ... agios.html

https://assets.nagios.com/downloads/nag ... ios_XI.pdf

I imported and translated many MIBs for two switches and am currently monitoring traps on both of them, but the trap status is stuck on TRAP RESET.

The snmptt.log file shows traps being sent from these devices to Nagios, but nothing is recorded as an event in the Nagios UI.

I see unknown events too, from the same switch(es), so that confuses me. I keep reading about NSTI, but it doesn't sound like it's a requirement to get basic traps working.

I admit I still don't completely understand the traps and how they are set up in Nagios, but I'm hoping you guys can point me in the right direction.

Thanks!!

RMH

ssax · Post by **ssax** » Tue Jul 28, 2015 12:25 pm

The second link you posted is the one that you should use.

The way traps work: the device sends a trap to your XI server, snmptrapd receives the trap and feeds it into snmptt, and snmptt then looks in the /etc/snmp/snmptt.conf file. If the OID is in there, snmptt runs the EXEC line of that trap definition (which should call /usr/local/bin/snmptraphandling.py), and that script sends the result into Nagios.

As long as you have the proper MIBs loaded and have restarted snmptt, it should work:
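As a rough illustration of the last hop in that chain: a handler ultimately hands Nagios a passive check result as one line written to the external command file. The sketch below is not the actual snmptraphandling.py. The function names are hypothetical, and the command-file path is an assumption (a common default for source installs) that may differ on your system.

```python
import time

# Assumed location of the Nagios external command file; adjust if yours differs.
NAGIOS_CMD_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # A newline ends the command, so flatten any newlines in the output text.
    output = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit_passive_result(line, cmd_file=NAGIOS_CMD_FILE):
    """Write the command line to the Nagios command pipe (hypothetical helper)."""
    with open(cmd_file, "w") as pipe:
        pipe.write(line)


if __name__ == "__main__":
    # Only prints the line; nothing is sent to Nagios here.
    print(format_passive_result("switch01", "SNMP Traps", 2,
                                "Link down on interface 24067"), end="")
```

The host and service fields in that line are what Nagios uses to find the object to update, which is why the names in the trap definition and in XI have to line up.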

Code: Select all

service snmptt restart

You might want to check Admin > Monitoring Config > Unconfigured Objects to see if they are showing up there. The name of the host in the trap and in XI needs to be identical.

You might want to run this command and watch where the traps are going (unknown or not):

Code: Select all

tail -f /var/log/snmptt/snmptt.*

If they are going into the snmpttunknown.log file then you don't have the proper MIBs added.

rkymtnhigh · Post by **rkymtnhigh** » Tue Jul 28, 2015 12:45 pm

Thanks ssax!

I restarted the snmptt service and checked for unconfigured objects; unfortunately, there were none.

I have run the SNMP Trap Wizard on the two switches to monitor for traps. They say TRAP RESET currently.

I can see the traps going into the snmptt.log file, but they still do not show up in Nagios.

I still see occasional unknown traps going into the unknown log file, but it seems like if other traps are landing in the known log file, they should be showing up in Nagios.

Thanks again,

RMH

EDIT: I deleted the trap service for my test host (switch) and sent some more traps, trying to get it to show up in unconfigured objects. After many successful traps, there are no unconfigured objects.

ssax · Post by **ssax** » Tue Jul 28, 2015 1:17 pm

Please post, from your /etc/snmp/snmptt.conf file, the trap definition for an OID that you know is coming in, so that we can see why it's not going into Nagios.

rkymtnhigh · Post by **rkymtnhigh** » Tue Jul 28, 2015 1:24 pm

Sure! Here is my linkDown:

Code: Select all

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Critical
FORMAT Link down on interface $1.  Admin state: $2.  Operational state: $3
#EXEC qpage -f TRAP notifygroup1 "Link down on interface $1.  Admin state: $2.  Operational state: $3"
SDESC
A linkDown trap signifies that the SNMP entity, acting in
an agent role, has detected that the ifOperStatus object for
one of its communication links is about to enter the down
state from some other state (but not from the notPresent
state).  This other state is indicated by the included value
of ifOperStatus.
EDESC

And a catchALL:

Code: Select all

EVENT CatchAll .1.* "SNMP Traps" Critical
FORMAT $D
EXEC /usr/local/nagios/libexec/eventhandlers/submit_check_result "$r" "snmp_traps" 2 "$O: $1 $2 $3 $4 $5"
EDESC

Thank you!

ssax · Post by **ssax** » Tue Jul 28, 2015 1:35 pm

Remove the linkDown and linkUp traps from your /etc/snmp/snmptt.conf

Then run this command:

Code: Select all

addmib /usr/share/snmp/mibs/IF-MIB.txt

That will add them to the file properly. If the EXEC line of a trap definition does not call /usr/local/bin/snmptraphandling.py, as in the entries below, it will not work:

Code: Select all

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Normal
FORMAT A linkDown trap signifies that the SNMP entity, acting in $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "A linkDown trap signifies that the SNMP entity, acting in $*"
SDESC
A linkDown trap signifies that the SNMP entity, acting in
an agent role, has detected that the ifOperStatus object for
one of its communication links is about to enter the down
state from some other state (but not from the notPresent
state).  This other state is indicated by the included value
of ifOperStatus.
Variables:
  1: ifIndex
  2: ifAdminStatus
  3: ifOperStatus
EDESC
#
#
#
EVENT linkUp .1.3.6.1.6.3.1.1.5.4 "Status Events" Normal
FORMAT A linkUp trap signifies that the SNMP entity, acting in an $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "A linkUp trap signifies that the SNMP entity, acting in an $*"
SDESC
A linkUp trap signifies that the SNMP entity, acting in an
agent role, has detected that the ifOperStatus object for
one of its communication links left the down state and
transitioned into some other state (but not into the
notPresent state).  This other state is indicated by the
included value of ifOperStatus.
Variables:
  1: ifIndex
  2: ifAdminStatus
  3: ifOperStatus
EDESC

Then restart snmptt:

Code: Select all

service snmptt restart
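To spot definitions that still lack the handler call, a small scan of the file can help. This is only a sketch under assumptions: the function name is hypothetical, and it treats any EVENT block without an uncommented EXEC line containing the handler path as suspect, which is a simplification of how snmptt reads the file.

```python
HANDLER = "/usr/local/bin/snmptraphandling.py"


def events_missing_handler(conf_text, handler=HANDLER):
    """Return (event_name, oid) pairs whose block has no EXEC line calling handler."""
    missing = []
    current = None       # (name, oid) of the EVENT block being read
    has_handler = False
    in_desc = False      # True between SDESC and EDESC, where text is free-form
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line == "SDESC":
            in_desc = True
            continue
        if line == "EDESC":
            in_desc = False
            continue
        if in_desc:
            continue
        if line.startswith("EVENT "):
            # Close out the previous block before starting a new one.
            if current is not None and not has_handler:
                missing.append(current)
            parts = line.split()
            current = (parts[1], parts[2]) if len(parts) >= 3 else None
            has_handler = False
        elif line.startswith("EXEC ") and handler in line:
            # Commented-out lines such as "#EXEC ..." never reach this branch.
            has_handler = True
    if current is not None and not has_handler:
        missing.append(current)
    return missing
```

Passing the contents of /etc/snmp/snmptt.conf to `events_missing_handler` would list the event names and OIDs worth checking by hand. Duplicate definitions of the same trap, like the several linkUp and linkDown entries that can accumulate in the file, show up as separate pairs.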

ssax · Post by **ssax** » Tue Jul 28, 2015 1:37 pm

Also, you'll have to change them from Normal to Critical if that's what you want.

rkymtnhigh · Post by **rkymtnhigh** » Tue Jul 28, 2015 2:40 pm

There were several places where linkUp and linkDown were set in the snmptt.conf file.

I have removed all of the entries and run addmib. This placed the entries at the end of my file, with the .py script called.

I have changed it to linkDown = Critical and can generate the Critical events in the snmptt.log file.

However, nothing shows in the UI yet. I see "No check results for service yet..."

I think we're getting closer, both in my understanding and in a working trap service!

When I run: echo 'check table nagios_systemcommands;' | mysql -t -pnagiosxi nagios
Here is what I get:
+------------------------------+-------+----------+-----------------------------------------------+
| Table                        | Op    | Msg_type | Msg_text                                      |
+------------------------------+-------+----------+-----------------------------------------------+
| nagios.nagios_systemcommands | check | Error    | Incorrect file format 'nagios_systemcommands' |
| nagios.nagios_systemcommands | check | error    | Corrupt                                       |
+------------------------------+-------+----------+-----------------------------------------------+

I also see the incorrect file format message over and over again in /var/log/messages

EDIT2: I have fixed the corrupt tables, but I am still not getting the events in Nagios XI. Here is the messages log after a Critical trap is sent:

Jul 28 14:22:12 nag01-dev snmptrapd[29707]: 2015-07-28 14:22:12 192.168.XX.XXX(via UDP: [192.168.XX.XXX]:52809->[192.168.XX.XXX]) TRAP, SNMP v1, community XXXXXXXXX#012#011SNMPv2-MIB::snmpTraps Link Down Trap (0) Uptime: 10 days, 21:17:11.01#012#011IF-MIB::ifIndex.24067 = INTEGER: 24067#011IF-MIB::ifDescr.24067 = STRING: Loopback10#011IF-MIB::ifType.24067 = INTEGER: softwareLoopback(24)#011SNMPv2-SMI::enterprises.9.2.2.1.1.20.24067 = STRING: "administratively down"
Jul 28 14:22:14 nag01-dev nagios: Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;192.168.XX.XXX;SNMP Traps;2;A linkDown trap signifies that the SNMP entity, acting in 24067 Loopback10 softwareLoopback administratively down / ifIndex.24067 (INTEGER32):24067 ifDescr.24067 (OCTETSTR):Loopback10 ifType.24067 (INTEGER):softwareLoopback enterprises.9.2.2.1.1.20.24067 ():administratively down
Jul 28 14:22:14 nag01-dev nagios: External command error: Command failed
Jul 28 14:22:14 nag01-dev nagios: Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;192.168.XX.XXX;SNMP Traps;2;A linkDown trap signifies that the SNMP entity, acting in 24067 Loopback10 softwareLoopback administratively down / ifIndex.24067 (INTEGER32):24067 ifDescr.24067 (OCTETSTR):Loopback10 ifType.24067 (INTEGER):softwareLoopback enterprises.9.2.2.1.1.20.24067 ():administratively down
Jul 28 14:22:14 nag01-dev nagios: External command error: Command failed

External command error: Command failed - seems to be the issue now!
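The failed lines carry the host name and service description that Nagios was asked to update, so pulling them out makes it easier to compare against what is configured in XI. The sketch below assumes the log format shown above; the function name is hypothetical, and a name mismatch is only one possible cause that the log itself does not confirm.

```python
PREFIX = "External command failed -> PROCESS_SERVICE_CHECK_RESULT;"


def parse_failed_result(log_line):
    """Return (host, service, return_code) from a failed-command log line, or None."""
    _, found, rest = log_line.partition(PREFIX)
    if not found:
        return None
    # Remaining fields are host;service;return_code;plugin_output
    fields = rest.split(";", 3)
    if len(fields) < 3 or not fields[2].isdigit():
        return None
    return fields[0], fields[1], int(fields[2])
```

For the log above this yields the masked IP address as the host and "SNMP Traps" as the service, so those are the two values worth checking against the host and service as they are defined in XI.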

I have also noticed it is saying SNMP v1, when we need to be using v2. I'm not sure where that is configured!

ssax · Post by **ssax** » Tue Jul 28, 2015 5:12 pm

Please post the output of this command:

Code: Select all

ls -l /usr/local/bin


ssax · Post by **ssax** » Tue Jul 28, 2015 5:15 pm

Sorry, that likely won't be it.

Are they showing in Admin > Unconfigured Objects now?

Nagios Support Forum
