SNMP Trap - notification delay
Posted: Thu Jul 04, 2013 6:32 am
by kiklop
Hi everyone,
I have a problem with my Nagios environment. I use SNMP traps for APC UPS units, and sometimes a unit generates several traps in a short period of time: for example, 10 critical traps in one minute, so I receive 10 critical notifications.
I would like to suppress this behaviour and receive only one notification. I tried setting the first notification delay to 5 minutes, but it seems to have no effect with passive checks.
Do you have any idea how to fix this? The UPS unit has no option for managing traps, so I think I need to put "some logic" between snmptt and Nagios.
Thanks for any ideas!
Re: SNMP Trap - notification delay
Posted: Mon Jul 08, 2013 11:50 am
by sreinhardt
You could alter the severity of the traps sent from the APC unit in the snmptt.conf file. This would allow you to set OK or warning instead of critical. Also, can you confirm that you do not have is_volatile set to 1 for the snmptrap service?
Re: SNMP Trap - notification delay
Posted: Tue Jul 09, 2013 2:08 am
by kiklop
Thanks for the reply!
Yes, I know that I can modify the severity, but in this case it really was a critical event (batteries discharged).
is_volatile is disabled; I have already checked this.
The problem is that the UPS sends a critical trap, then OK, then critical, then OK, and so on. It looks as if the service is flapping, but I don't want to miss any other traps.
I would like something like this: when a new trap is received, a script looks up the last trap and compares its severity. If the severity is the same, the new trap is thrown away; if it is different, a notification is sent.
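The comparison described above could be sketched roughly as follows. This is only a minimal illustration, not an existing Nagios or snmptt feature: the function name `should_forward`, the JSON state file and its location are all assumptions made for the example.

```python
import json
import os
import tempfile

# Hypothetical location of the state file; adjust to your own setup.
STATE_FILE = os.path.join(tempfile.gettempdir(), "trap_last_severity.json")


def should_forward(host, service, severity, state_file=STATE_FILE):
    """Return True if severity differs from the last one stored for host/service."""
    try:
        with open(state_file) as fh:
            state = json.load(fh)
    except (FileNotFoundError, ValueError):
        state = {}  # no state yet, or unreadable file: start fresh

    key = f"{host}/{service}"
    if state.get(key) == severity:
        return False  # same severity as the last trap: throw it away

    # Severity changed (or first trap seen): remember it and forward the trap.
    state[key] = severity
    with open(state_file, "w") as fh:
        json.dump(state, fh)
    return True
```

A trap handler would call `should_forward(...)` and submit the passive result only when it returns True. Note that with a strictly alternating critical/OK sequence every trap differs from the previous one, so this check alone would still forward all of them; it would probably need to be combined with some time-based limit.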
Re: SNMP Trap - notification delay
Posted: Tue Jul 09, 2013 1:23 pm
by sreinhardt
Ah, I see, so the constant OK/fail state is what is causing all the notifications. I do not know of a script that does what you are describing, but I can see how it would be helpful. You would likely need to keep a temporary file with the last 5-10 events, and possibly their timestamps, to compare against. From there you could just alter the snmptt.conf EVENT lines that would normally send to Nagios.
Is it possible to alter the alerts on the UPS side, say by rate limiting them?
Re: SNMP Trap - notification delay
Posted: Thu Jul 11, 2013 9:35 am
by kiklop
OK, now it is more "stable". I receive a critical trap every 2 minutes, and every 20 minutes I receive an OK trap. Yes, I know that I need to replace the batteries or check this unit, but I would still like to find a way to control duplicate traps. Is there any way, for example, to store all traps in a MySQL database, then filter them and push the result to Nagios, say with a cron script running every minute?
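To make the idea concrete, here is a rough sketch of such a store-then-filter step. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that the example is self-contained; the table layout and the function names are assumptions. The only Nagios-specific part is the `PROCESS_SERVICE_CHECK_RESULT` external command line, which the cron job would still have to write to the Nagios command file.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) trap table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS traps ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " ts INTEGER NOT NULL, host TEXT NOT NULL, service TEXT NOT NULL,"
        " code INTEGER NOT NULL, message TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )


def store_trap(conn, ts, host, service, code, message):
    """Called by the trap handler: just record the trap, do not notify."""
    conn.execute(
        "INSERT INTO traps (ts, host, service, code, message)"
        " VALUES (?, ?, ?, ?, ?)",
        (ts, host, service, code, message),
    )
    conn.commit()


def flush_pending(conn):
    """Called from cron: return one command line per host/service (newest trap)."""
    rows = conn.execute(
        "SELECT id, ts, host, service, code, message FROM traps"
        " WHERE processed = 0 ORDER BY id"
    ).fetchall()
    if not rows:
        return []

    latest = {}
    for _id, ts, host, service, code, message in rows:
        latest[(host, service)] = (ts, code, message)  # later rows overwrite earlier

    # Mark only the rows we actually read, in case new traps arrived meanwhile.
    conn.execute(
        "UPDATE traps SET processed = 1 WHERE processed = 0 AND id <= ?",
        (rows[-1][0],),
    )
    conn.commit()
    return [
        f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{message}"
        for (host, service), (ts, code, message) in latest.items()
    ]
```

With a one-minute cron interval this would collapse a burst of 10 traps into a single passive result per host and service. Keeping only the newest trap per run is just one possible policy; keeping the most severe one would be an equally reasonable choice.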
snmp traps log:
2013-07-11 16:05:53 10.10.0.1(via UDP: [10.10.0.1]:161->[10.10.0.12]) TRAP, SNMP v1, community public
PowerNet-MIB::apc Enterprise Specific Trap (PowerNet-MIB::upsDischarged) Uptime: 281 days, 4:33:58.60
PowerNet-MIB::mtrapargsString.0 = STRING: "UPS: Batteries discharged."
2013-07-11 16:07:54 10.10.0.1(via UDP: [10.10.0.1]:161->[10.10.0.12]) TRAP, SNMP v1, community public
PowerNet-MIB::apc Enterprise Specific Trap (PowerNet-MIB::upsDischarged) Uptime: 281 days, 4:35:59.32
PowerNet-MIB::mtrapargsString.0 = STRING: "UPS: Batteries discharged."
2013-07-11 16:09:55 10.10.0.1(via UDP: [10.10.0.1]:161->[10.10.0.12]) TRAP, SNMP v1, community public
PowerNet-MIB::apc Enterprise Specific Trap (PowerNet-MIB::upsDischarged) Uptime: 281 days, 4:38:00.05
PowerNet-MIB::mtrapargsString.0 = STRING: "UPS: Batteries discharged."
2013-07-11 16:11:55 10.10.0.1(via UDP: [10.10.0.1]:161->[10.10.0.12]) TRAP, SNMP v1, community public
PowerNet-MIB::apc Enterprise Specific Trap (PowerNet-MIB::upsDischarged) Uptime: 281 days, 4:40:00.77
PowerNet-MIB::mtrapargsString.0 = STRING: "UPS: Batteries discharged."
2013-07-11 16:13:56 10.10.0.1(via UDP: [10.10.0.1]:161->[10.10.0.12]) TRAP, SNMP v1, community public
PowerNet-MIB::apc Enterprise Specific Trap (PowerNet-MIB::upsDischarged) Uptime: 281 days, 4:42:01.51
PowerNet-MIB::mtrapargsString.0 = STRING: "UPS: Batteries discharged."
Re: SNMP Trap - notification delay
Posted: Thu Jul 11, 2013 1:00 pm
by sreinhardt
This is almost completely out of the scope of support, as we don't tend to create custom scripts and such. Getting the service into a flapping state would not solve it either, since you would then miss the other traps coming in. Instead, your best option is probably to alter a copy of snmptraphandler.py to suit your needs. Something along the lines of:
When the trap is submitted, the script checks a temp file.
The temp file holds a date/time stamp.
If that stamp is not more than 1 day old, no event is submitted.
If it is older than a day, the event is submitted and the temp file is rewritten with the new time.
I would suggest doing this only for the single trap regarding batteries. We can give you some suggestions regarding this, but much of the work needs to be done by you.
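As a starting point, the temp-file check outlined above might look roughly like this. It is a standalone sketch and not taken from snmptraphandler.py; the function name `should_submit` and the stamp file format (a single Unix timestamp) are assumptions.

```python
import os
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # the "1 day" window from the steps above


def should_submit(stamp_file, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if the event should be submitted, and update the stamp file."""
    now = time.time() if now is None else now
    try:
        with open(stamp_file) as fh:
            last = float(fh.read().strip())
    except (FileNotFoundError, ValueError):
        last = None  # no previous event recorded, or unreadable stamp

    if last is not None and now - last <= max_age:
        return False  # stamp is not more than max_age old: suppress the event

    # Stamp is missing or older than max_age: submit and record the new time.
    with open(stamp_file, "w") as fh:
        fh.write(str(now))
    return True
```

The handler copy would call this only for the battery trap, for example with a separate stamp file per UPS host, and skip the passive check submission whenever it returns False.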