Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Mon Jun 16, 2014 11:26 am
by sreinhardt
Could you post a screenshot or copy of the state history report for this service, showing both hard and soft states, for the time period covered by the logs you just posted? This is mainly to get human-readable timestamps rather than converting every epoch value in the nagios log by hand. Unfortunately the report will not show any OK->OK results, since a repeated OK is not a hard or soft state change, but the same is true of every log we have looked at so far, aside from snmptt.log. The only way to change that would be to make the service volatile and give it some time to collect heartbeats. That would cause every incoming check result to be stored in the state history logs.
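If converting the timestamps by hand is the main pain point, a small script can do it. This is only a minimal sketch, assuming each nagios.log line starts with a bracketed Unix epoch such as "[1402932360]"; the function name humanize is made up for illustration and times are rendered in UTC unless another timezone is passed.

```python
import re
from datetime import datetime, timezone

# Assumption: Nagios log lines begin with "[<unix epoch>] ".
EPOCH_PREFIX = re.compile(r"^\[(\d+)\]\s?(.*)$")


def humanize(line, tz=timezone.utc):
    """Return the line with its leading [epoch] rewritten as a readable time."""
    match = EPOCH_PREFIX.match(line)
    if match is None:
        return line  # lines without a leading timestamp are left untouched
    stamp = datetime.fromtimestamp(int(match.group(1)), tz)
    return f"[{stamp:%Y-%m-%d %H:%M:%S}] {match.group(2)}"
```

Calling humanize() on each line of the log (for example in a loop over an open file) gives output that can be lined up directly against snmptt.log.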

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Fri Jul 25, 2014 1:49 pm
by mlopez
Hi Spenser,
I think I figured out the issue, but I need your help! BTW, sorry for the delay in responding; I was sidetracked by another project.

OK, you remember how "heartbeat" with a "Normal" status was showing up in snmptt.log but not always in nagios.log? Well, I tried something out of the box: I changed the severity in my MIB "/usr/share/snmp/mibs/processed_mibs/notification-mib" to Warning, and guess what! The heartbeat showed up every time.

This means that something is going on with snmptt or with my event handler, /usr/local/bin/snmptraphandling.py: when the state is "OK", new results are not forwarded unless the state changes. I need them forwarded every time, even when the state is still OK. BTW, this is not limited to this service; it happens on all services. I made the same change to a highly active SNMP trap: in Warning it was logged every time, but in OK it was logged only once in nagios.log.

Example:
heartbeat OK <---- THIS IS SENT TO NAGIOS
heartbeat OK <---- THIS IS NOT SENT TO NAGIOS
heartbeat OK <---- THIS IS NOT SENT TO NAGIOS
heartbeat OK <---- THIS IS NOT SENT TO NAGIOS
heartbeat OK <---- THIS IS NOT SENT TO NAGIOS
heartbeat CRITICAL <---- NAGIOS PROCESSES nagios.log FOR THE SNMP TRAP, DETECTS NO NEW "heartbeat" TRAPS FOR X HOURS, AND CHANGES THE STATE TO CRITICAL
heartbeat OK <---- THIS IS SENT TO NAGIOS


snmptt trap version: snmptt-1.4-0.9.beta2.el6.noarch

I'm wondering if you could help me with this.

I've attached my snmptt.ini and my snmptraphandling.py

EVENT heartbeatNotify .1.3.6.1.4.1.x.100.3.1 "Status Events" Warning
FORMAT $4 heartbeat $7 = $8 $5
EXEC /usr/local/bin/snmptraphandling.py "$1" "heartbeatNotify" "$s" "$@" "$-*" "$1 $7 $8"
SDESC
This is the system's heartbeat.
Variables:
  1: smsSystemSn
  2: smsSystemName
  3: smsSystemLocation
  4: smsSystemDescription
  5: smsSystemTime
  6: smsOrigin
  7: smsObjectName
  8: smsObjectValue
EDESC
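
For reference, here is a rough sketch of what a handler invoked by the EXEC line above could do with its arguments: turn them into a Nagios PROCESS_SERVICE_CHECK_RESULT external command. It is not the attached snmptraphandling.py; the severity mapping and the function name build_passive_result are assumptions made for illustration, and the real script may map severities or name the service differently.

```python
# Assumed mapping from SNMPTT severity strings to Nagios return codes.
# The attached snmptraphandling.py may use a different table.
SEVERITY_TO_RETURN_CODE = {"normal": 0, "ok": 0, "warning": 1, "critical": 2}


def build_passive_result(host, service, severity, epoch, output):
    """Format one passive service check result as an external command line."""
    # Unknown severities fall back to 3 (UNKNOWN) instead of being dropped.
    code = SEVERITY_TO_RETURN_CODE.get(severity.lower(), 3)
    # The command must be a single line, so flatten any embedded newlines.
    text = " ".join(output.splitlines())
    return f"[{int(epoch)}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{text}"
```

Nothing in a handler shaped like this treats return code 0 specially, so a "Normal" trap would produce a command line just like a "Warning" one. If the real handler behaves the same way, the repeated OK results are more likely being dropped or hidden further downstream.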

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Fri Jul 25, 2014 1:56 pm
by sreinhardt
We should enable debugging on snmptt and revert the severity from Warning back to OK. I say this because we can then see when every trap comes in (snmptt.log) and when and how each one was exec'd (snmptt.debug). I am wondering whether your service is not set as is_volatile, so Nagios is just going "hey great, it's still OK, nothing to see here". I would still expect the result in the nagios log, but not in an event log or state change report. Could you send me an example of the snmptt and nagios logs as you have them now? Then we can get you set up for debugging and capture some more to compare.
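
To make that comparison quicker, something like the following could tally how many passive results for a service actually reached nagios.log, broken down by return code. It is a sketch only: it assumes passive check logging is enabled and that entries look like "[epoch] PASSIVE SERVICE CHECK: host;service;code;output". The function name count_passive_results is made up.

```python
from collections import Counter

MARKER = "PASSIVE SERVICE CHECK: "


def count_passive_results(lines, service):
    """Count logged passive results for one service, keyed by return code."""
    counts = Counter()
    for line in lines:
        _, found, rest = line.partition(MARKER)
        if not found:
            continue  # not a passive service check entry
        fields = rest.split(";", 3)  # host, service, return code, output
        if len(fields) >= 3 and fields[1] == service:
            counts[fields[2]] += 1
    return counts
```

If the count under "0" is far lower than the number of Normal heartbeats in snmptt.log for the same window, that confirms the OK results are going missing before they are logged.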