Retaining SNMP traps until acknowledgement?

Posted: Wed Aug 09, 2017 6:40 am
by mvndnburg
Hi,

Apologies if this has been asked before - I got 400+ pages of hits on 'snmp' and wasn't going to risk carpal tunnel ;)

Running Nagios XI 5.4.4 on RHEL 6.

We have a number of systems sending SNMP traps, for example Oracle Enterprise Manager, Cisco Tidal, vCenter, etc. What we see is that consecutive traps related to one service on one host overwrite each other. I understand that this is standard Nagios behaviour, but it can be problematic when there are multiple alerts that must stay visible until acknowledged.

Example 1: Tidal (scheduling software) sends traps for job completion status (i.e. succeeded, failed, not started, etc.). All these jobs (we have thousands) reside on the same Tidal host and send traps to the same service, 'Tidal SNMP Traps'. In the 'Operations Center' screen, only the latest trap is displayed. This will lead to traps being missed by the monitoring team if they are received in quick succession.

Example 2: NetApp sends traps about filer status (among other things). By default each trap (and there are 218 in the MIB) uses the same service name 'NetApp SNMP Traps'. This leads to traps 'overwriting' each other in Nagios if they come close together and relate to the same host. See attached snapshot.

Our previous monitoring software displayed all the traps in chronological order until they were manually acknowledged by an engineer. Is it possible to have this behaviour in Nagios as well, without having to resort to the 'Event Log' page, which quickly fills up when hundreds of hosts are monitored?
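
To make the difference concrete, here is a rough Python sketch of the two behaviours. It is only an illustration and not anything built into Nagios XI: the `TrapBoard` class, its method names, and the host name `tidal01` are all made up. The `latest` dictionary mimics the current "one status per host/service" view, while `pending()` mimics what our previous tool showed, i.e. every trap in arrival order until someone acknowledges it.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Trap:
    trap_id: int
    host: str
    service: str
    message: str
    acknowledged: bool = False


class TrapBoard:
    """Hypothetical in-memory model of trap handling (not a Nagios API)."""

    def __init__(self):
        self._ids = count(1)
        self._traps = []   # every trap received, in arrival order
        self.latest = {}   # (host, service) -> most recent trap only

    def receive(self, host, service, message):
        """Store a new trap and return its id."""
        trap = Trap(next(self._ids), host, service, message)
        self._traps.append(trap)
        # The "overwrite" view: a newer trap replaces the older one.
        self.latest[(host, service)] = trap
        return trap.trap_id

    def acknowledge(self, trap_id):
        """Mark one trap as acknowledged; return False if the id is unknown."""
        for trap in self._traps:
            if trap.trap_id == trap_id:
                trap.acknowledged = True
                return True
        return False

    def pending(self):
        """All unacknowledged traps, oldest first."""
        return [t for t in self._traps if not t.acknowledged]


board = TrapBoard()
first = board.receive("tidal01", "Tidal SNMP Traps", "job A failed")
board.receive("tidal01", "Tidal SNMP Traps", "job B succeeded")
```

After these two traps, `board.latest` holds a single entry (the "job B succeeded" trap), whereas `board.pending()` still lists both until `board.acknowledge(first)` is called.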

Re: Retaining SNMP traps until acknowledgement?

Posted: Wed Aug 09, 2017 7:30 am
by mvndnburg
I found this post from 2013.
Has the trap logic and/or Nagios XI's processing of traps changed since then?

Re: Retaining SNMP traps until acknowledgement?

Posted: Wed Aug 09, 2017 4:25 pm
by dwhitfield
mvndnburg wrote:Is it possible to have this behaviour in Nagios as well, without having to resort to the 'Event Log' page, which quickly fills up when hundreds of hosts are monitored?
It's not built-in, which I suspect is what you are asking.

As for the Event Log page, there is search, so if the tech knows the device it should be easy enough to find historical info.

Re: Retaining SNMP traps until acknowledgement?

Posted: Thu Aug 10, 2017 4:10 am
by mvndnburg
I think it would be a useful addition to Nagios to have a screen similar to the Operations Screen, with an automatically refreshed display of incoming SNMP traps in which new traps do not overwrite previous ones. This is how, for example, HP Operations Manager displays alerts.

I have played around a bit with the Event Log, filtering on 'SNMP traps' and adding the result as a report to 'my views', 'my reports' or 'tools'. Unfortunately, Event Log does not update itself periodically.

Q. Is there a way to let 'Event Log' update itself?
Q. Can the Event Log be added to a Dashboard, like the host/service groups?

Re: Retaining SNMP traps until acknowledgement?

Posted: Thu Aug 10, 2017 3:26 pm
by tgriep
There is no way to automatically refresh the Event Log page itself, but if you add it as a view, it will refresh automatically on whatever schedule you have set on the Views menu.