SNMP Traps coming in as experimental

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

SNMP Traps coming in as experimental

Post by WillemDH »

Hello,

Could I get some help troubleshooting this issue? I configured a Brocade switch to send an SNMP trap every minute. I imported several MIB files, at first in the wrong order, as some turned out to depend on others. In the end I was able to import them without errors and the traps are arriving in the service, but the description of the trap is always "5 / experimental.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.3 ():5"

I have no idea where this is coming from. The description should be something like "temperature above thresholds." Somehow the trap translation is going wrong somewhere.

Any tips on how to troubleshoot this? I followed http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
In the snmpttunknown.log, there are no entries. In the snmptt.log I can see the traps coming in:

Code: Select all

Mon Jan 12 16:11:06 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg_brocade - 5
Mon Jan 12 16:11:36 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:12:06 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:12:36 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:13:07 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:13:37 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:14:07 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:14:37 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:15:08 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:15:38 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:16:08 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:16:38 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:17:09 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:17:45 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:17:45 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Mon Jan 12 16:22:10 2015 .1.3.6.1.3.94.0.5 Normal "Status Events" dg1_brocade02 - 5
Thanks for any help on this.

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net
bdgoecke
Posts: 36
Joined: Wed Oct 22, 2014 3:41 pm

Re: SNMP Traps coming in as experimental

Post by bdgoecke »

Could we see your /etc/snmp/snmptt.conf?

And which MIBs did you load?
Could you post a link to the MIBs?

Thanks.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: SNMP Traps coming in as experimental

Post by Box293 »

This is good.
WillemDH wrote:In the snmptt.log I can see the traps coming in
WillemDH wrote:but the description of the trap is always "5 / experimental.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.3 ():5"
I can explain what is happening here. There are two different things to cover:

1)
This data consists of two parts, with the / as the separator between them:
5 = The "data", the value received
experimental.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.3 ():5 = The Perfdata received from snmptraphandling.py.

Normally perfdata follows a | symbol, but given the sporadic nature of SNMP trap data, putting this into RRDs would be pointless, as it could be days/weeks/months before a trap is received. I believe this is why a / is used instead.

snmptraphandling.py is the script used by the EXEC statement in your SNMPTT EVENT and it is responsible for submitting the Passive Service Check Result to Nagios.

2)
experimental.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.3 ():5
This has to do with SNMPTT not performing a correct translation of the OID received in the trap. This can happen even when the MIBs are correctly loaded.

To enable full OID translation run these three commands:

Code: Select all

sed -i 's/.*mibs_environment.*/mibs_environment = ALL/g' /etc/snmp/snmptt.ini
sed -i 's/.*translate_integers.*/translate_integers = 0/g' /etc/snmp/snmptt.ini
service snmptt restart
Then send a trap and check what the status appears as in Nagios.
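
To make the two-part layout from point 1 concrete, here is a minimal Python sketch that splits such a status string into the value and the variable details. The layout ("value / name (TYPE):value") is inferred only from the two sample strings in this thread, and the function name split_trap_output is hypothetical, not part of snmptraphandling.py.

```python
def split_trap_output(text):
    """Split 'DATA / NAME (TYPE):VALUE' into its parts.

    The layout is an assumption based on the samples in this thread.
    """
    # Everything before ' / ' is the data; the rest describes the variable.
    data, _, detail = text.partition(" / ")
    # The detail looks like 'name (TYPE):value'; TYPE may be empty.
    name, _, rest = detail.partition(" (")
    var_type, _, var_value = rest.partition("):")
    return {
        "data": data.strip(),
        "variable": name,
        "type": var_type,
        "value": var_value,
    }


if __name__ == "__main__":
    sample = ("3 / connUnitSensorStatus.16.0.0.39.248.11.151.208"
              ".0.0.0.0.0.0.0.0.1 (INTEGER):3")
    print(split_trap_output(sample))
```

An empty "type" field, as in the untranslated "():5" sample, is a quick hint that the OID was not resolved against the MIB.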


I am in the process of finishing up a tutorial on SNMP Traps, which should be released soon.


If you are still having problems make sure you email through the info bdgoecke requested.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: SNMP Traps coming in as experimental

Post by WillemDH »

Troy, bdgoecke,

Thanks for the help.

I did what you asked:

Code: Select all

sed -i 's/.*mibs_environment.*/mibs_environment = ALL/g' /etc/snmp/snmptt.ini
sed -i 's/.*translate_integers.*/translate_integers = 0/g' /etc/snmp/snmptt.ini
The sent trap does look different now:

Code: Select all

3 / connUnitSensorStatus.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.1 (INTEGER):3
But it is still not quite what I would have expected. As this is just a test trap we created by setting the low temperature threshold above the current temperature, we would at least want to see something like "temperature threshold reached".

The MIBs I imported were:
ENTITY_RFC2737.mib
BRCD_REG.mib
BRCD_TC.mib
SW.mib
FA.mib
HA.mib
See attached files: a zip with all the above MIBs and snmptt.conf

Grtz

Willem
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: SNMP Traps coming in as experimental

Post by sreinhardt »

From your included snmptt.conf, I see that line 6287 starts the trap definition you are likely seeing here. However it seems that your device sends far more for the value oid than connUnitSensorStatus. Let's stop snmptt, have a trap or two like this come in, and collect the spooled files.

Code: Select all

service snmptt stop
[send a trap or two causing you issues]
[collect traps from /var/spool/snmptt/]
service snmptt start
Let's see if that's a translation issue, or if your device really is sending that many additional OID specifiers within the trap.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: SNMP Traps coming in as experimental

Post by Box293 »

So here is the event in the snmptt.conf file:

Code: Select all

EVENT connUnitSensorStatusChange .1.3.6.1.3.94.0.5 "Status Events" Normal
FORMAT $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "$*"
SDESC

The overall status of the connectivity unit has 
changed.
Recommended severity level (for filtering): alert 
Variables:
  1: connUnitSensorStatus
EDESC
This is the trap definition in FA.mib

Code: Select all

connUnitSensorStatusChange TRAP-TYPE 
        ENTERPRISE fcmgmt 
        VARIABLES { connUnitSensorStatus } 
        DESCRIPTION 
            "The overall status of the connectivity unit has 
            changed.
            Recommended severity level (for filtering): alert" 
        ::= 5  
The variable connUnitSensorStatus links to this definition in FA.mib

Code: Select all

connUnitSensorStatus OBJECT-TYPE
            SYNTAX INTEGER {
                unknown(1),
                other(2),
                ok(3),      -- the sensor indicates ok
                warning(4), -- the sensor indicates a warning
                failed(5)   -- the sensor indicates failure
            }
            ACCESS read-only
            STATUS mandatory
            DESCRIPTION
                "The status indicated by the sensor."
            ::= { connUnitSensorEntry 4 }
So if you're getting a value of 5 then it is indicating a failure.
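
As a rough illustration of that enumeration, the sketch below turns the integer carried by the trap into a readable label. The values are copied from the connUnitSensorStatus definition quoted above; the class name and the describe() helper are hypothetical and only meant to show the lookup.

```python
from enum import IntEnum


class ConnUnitSensorStatus(IntEnum):
    """Values of connUnitSensorStatus as quoted from FA.mib above."""
    UNKNOWN = 1
    OTHER = 2
    OK = 3
    WARNING = 4
    FAILED = 5


def describe(value):
    """Return a readable label for the integer received in the trap."""
    try:
        status = ConnUnitSensorStatus(int(value))
    except ValueError:
        # Not a number, or a number outside the MIB's enumeration.
        return f"unrecognised sensor status {value!r}"
    return f"sensor status: {status.name.lower()}"


if __name__ == "__main__":
    print(describe("5"))
```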

Unfortunately it's not a temperature value type of trap, it's more like the sensor status changed.


Also, send through the information Spenser has requested as that will help as well.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: SNMP Traps coming in as experimental

Post by WillemDH »

Hey,

Thanks again. So I stopped snmptt and traps indeed started showing up in the /var/spool/snmptt folder.

The content looks like this:

Code: Select all

1421222634
brocadefqdn
UDP: [10.54.xx.xx]:32768->[10.54.xx.xx]
.1.3.6.1.2.1.1.3.0 228:22:32:51.28
.1.3.6.1.6.3.1.1.4.1.0 .1.3.6.1.3.94.0.5
.1.3.6.1.3.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.2 5
.1.3.6.1.6.3.18.1.3.0 10.54.xx.xx
.1.3.6.1.6.3.18.1.4.0 "nagios"
.1.3.6.1.6.3.1.1.4.3.0 .1.3.6.1.3.94
So what does this mean? I just want something meaningful to appear in the trap service. Just "5" with experimental doesn't mean anything to our support team. If it said something like "sensor state changed", I'd already be a little happy... :)
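
For reading a spooled file like the one above, here is a small Python sketch. It assumes the layout seen in this sample: a timestamp line, a hostname line, a transport line, then one "OID value" pair per line. The function name parse_spooled_trap is hypothetical. The OID .1.3.6.1.6.3.1.1.4.1.0 is the standard snmpTrapOID.0, whose value identifies which trap was sent.

```python
SNMP_TRAP_OID = ".1.3.6.1.6.3.1.1.4.1.0"  # snmpTrapOID.0

SAMPLE = """\
1421222634
brocadefqdn
UDP: [10.54.xx.xx]:32768->[10.54.xx.xx]
.1.3.6.1.2.1.1.3.0 228:22:32:51.28
.1.3.6.1.6.3.1.1.4.1.0 .1.3.6.1.3.94.0.5
.1.3.6.1.3.94.1.8.1.4.16.0.0.39.248.11.151.208.0.0.0.0.0.0.0.0.2 5
.1.3.6.1.6.3.18.1.3.0 10.54.xx.xx
.1.3.6.1.6.3.18.1.4.0 "nagios"
.1.3.6.1.6.3.1.1.4.3.0 .1.3.6.1.3.94
"""


def parse_spooled_trap(text):
    """Parse one spooled trap, assuming the layout of the sample above."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    received, hostname, transport = lines[0], lines[1], lines[2]
    varbinds = []
    for ln in lines[3:]:
        # Each remaining line is 'OID value', split on the first space.
        oid, _, value = ln.partition(" ")
        varbinds.append((oid, value.strip()))
    # The value bound to snmpTrapOID.0 names the trap itself.
    trap_oid = next((v for o, v in varbinds if o == SNMP_TRAP_OID), None)
    return {
        "received": int(received),
        "hostname": hostname,
        "transport": transport,
        "trap_oid": trap_oid,
        "varbinds": varbinds,
    }


if __name__ == "__main__":
    trap = parse_spooled_trap(SAMPLE)
    print(trap["trap_oid"], trap["varbinds"][2])
```

Read this way, the file shows the trap .1.3.6.1.3.94.0.5 carrying a single device-specific variable whose value is 5.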

Grtz

Willem
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: SNMP Traps coming in as experimental

Post by sreinhardt »

OK, so we can see everything is coming in properly as numeric and your system is registering the integer sent. The problem here is twofold: 1) snmptt just relays the information and does not provide context for the value, and 2) snmptraphandling.py does not know how to do the conversion. In your case, this is what those integers mean:

Code: Select all

unknown = 1
other = 2
ok = 3     -- the sensor indicates ok
warning = 4 -- the sensor indicates a warning
failed = 5   -- the sensor indicates failure
With that, we could change the event in your snmptt.conf file (the one generated from your translated MIB) to something like:

Code: Select all

EVENT connUnitSensorStatusChange .1.3.6.1.3.94.0.5 "Status Events" Normal
FORMAT $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$1" "$@" "$-*" "$*"
SDESC

The overall status of the connectivity unit has
changed.
Recommended severity level (for filtering): alert
Variables:
  1: connUnitSensorStatus
EDESC
Note that the only change is on the EXEC line, from $s to $1. Instead of passing the severity keyword (Normal) from the EVENT line, it now passes the value of variable 1 provided by the trap. At this point we are sending the integer to snmptraphandling.py, and we can tell it how to react. Near the top of the script there is a chain of if/else statements that handles the severity. The issue as I see it is that your device's values conflict with the Nagios default values, and since the script already handles those, it will be tricky to make yours pick up properly. Changing "$1" to "$1 alt" and checking for the additional string may work best.
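
One way the "$1 alt" idea could look is sketched below in Python. This is not the actual contents of snmptraphandling.py: the function name get_return_code, the KEYWORDS table standing in for the script's existing severity chain, and the choice of which Nagios state each MIB value maps to are all assumptions for illustration. Only the MIB values (1-5) and the Nagios return codes (0-3) are taken as given.

```python
# Standard Nagios return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# MIB value -> (label, Nagios return code).
# The Nagios column is a suggested mapping, not something the MIB defines.
SENSOR_STATUS = {
    "1": ("unknown", UNKNOWN),
    "2": ("other", UNKNOWN),
    "3": ("ok", OK),
    "4": ("warning", WARNING),
    "5": ("failed", CRITICAL),
}

# Placeholder for the script's existing severity keyword handling
# (hypothetical; the real chain may contain other keywords).
KEYWORDS = {"NORMAL": OK, "WARNING": WARNING, "CRITICAL": CRITICAL}


def get_return_code(severity):
    """Return (nagios_code, label) for a severity argument.

    A trailing 'alt' marker, as in '5 alt', switches to the MIB's
    sensor-status meaning instead of the default keyword handling.
    """
    parts = severity.split()
    if len(parts) == 2 and parts[1] == "alt":
        label, code = SENSOR_STATUS.get(parts[0], ("unrecognised", UNKNOWN))
        return code, label
    keyword = severity.strip()
    return KEYWORDS.get(keyword.upper(), UNKNOWN), keyword.lower()


if __name__ == "__main__":
    print(get_return_code("5 alt"))
    print(get_return_code("Normal"))
```

The marker keeps the existing keyword handling untouched, so other traps that still pass $s are unaffected.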
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: SNMP Traps coming in as experimental

Post by WillemDH »

Hey Spenser,

Thanks for the help. Are you saying this trap, sent by a Brocade SAN switch, is different from the 'usual' trap? It seems strange that Nagios has no way of handling this. I'll do some tests with your suggestions during the course of the week.

Grtz

Willem
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: SNMP Traps coming in as experimental

Post by sreinhardt »

It's actually not different, and it's not really that Nagios can't handle it. The route a trap takes is like so:

remote device -> snmptrapd -> spooled file -> snmptt -> exec line to snmptraphandling.py -> nagios check result

The trap sent from your device is perfectly fine, but the MIB uses small integers (1-5) for the sensor status, and Nagios also uses small integers (0-3) for its own status codes. Unfortunately the two schemes conflict. The conflict actually surfaces in snmptraphandling.py: if you passed the value straight from the trap, snmptraphandling would read it per Nagios conventions rather than per your MIB's definitions. For example, 3 means ok in your MIB but UNKNOWN in Nagios.

To resolve this, we should only need to tell the snmptt daemon to send more than a numeric status code, and have snmptraphandling interpret that correctly. Does that make a bit more sense? It's an odd situation all around. :)
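
For the last hop in that chain, here is a hedged Python sketch of what a passive check result looks like once the status has been interpreted. It assumes the result is submitted with the standard Nagios external command PROCESS_SERVICE_CHECK_RESULT; the helper name passive_result_line is hypothetical, and actually writing the line to the Nagios command file is left out because the path depends on the installation.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    Layout: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    timestamp = int(time.time()) if now is None else int(now)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}")


if __name__ == "__main__":
    # Host and service names are taken from the log and EXEC line in this thread.
    print(passive_result_line("dg1_brocade02", "SNMP Traps", 2,
                              "sensor status: failed"))
```

With a readable output string in the last field, the trap service would show text such as "sensor status: failed" instead of a bare 5.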