Retaining SNMP traps until acknowledgement?
Posted: Wed Aug 09, 2017 6:40 am
Hi,
Apologies if this has been asked before - I got 400+ pages of hits on 'snmp' and wasn't going to risk carpal tunnel.
Running Nagios XI 5.4.4 on RHEL 6.
We have a number of systems sending SNMP traps, for example Oracle Enterprise Manager, Cisco Tidal, vCenter, etc. What we see is that consecutive traps related to one service on one host overwrite each other. I understand that this is standard behaviour in Nagios, but it is problematic when there are multiple alerts that must stay visible until acknowledged.
Example 1: Tidal (scheduling software) sends traps for job completion status (e.g. succeeded, failed, not started). All these jobs (we have thousands) reside on the same Tidal host and send traps to the same service, 'Tidal SNMP Traps'. In the 'Operations Center' screen, only the latest trap is displayed. As a result, the monitoring team will miss traps that arrive in quick succession.
Example 2: NetApp sends traps about filer status (among other things). By default each trap (and there are 218 in the MIB) uses the same service name 'NetApp SNMP Traps'. This leads to traps 'overwriting' each other in Nagios if they come close together and relate to the same host. See attached snapshot.
Our previous monitoring software displayed all the traps in chronological order until they were manually acknowledged by an engineer. Is it possible to have this behaviour in Nagios as well, without having to resort to the 'Event Log' page, which quickly fills up when hundreds of hosts are monitored?