
Nagios XI intermittent issue registering Normal status

Posted: Thu Jun 18, 2020 6:10 am
by nfv_nagios
Hi Support,

I am using Nagios XI 5.4.0 on Red Hat (kernel 2.6.32-642.11.1.el6.x86_64).
SNMP Trap Sender version 1.6.2

I have configured Nagios XI to receive Cisco switch SNMP Traps.
I am able to receive both LinkDown (Critical severity) and LinkUp (Normal severity) traps for different ports.
All these traps are logged in /var/log/snmptt/snmptt.debug.
However, in snmptrapsender.log, I can see all the LinkDown events but not all the corresponding LinkUp events.

As you can see below, only the LinkUp events for interfaces 4 and 8 are registered in snmptrapsender.log.

After several rounds of testing, I realised that it is always either the 2nd or the 3rd LinkUp event that does not register in snmptrapsender.log. Can you advise how to resolve this?

snmptt.debug

Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up


snmptrapsender.log

2020-06-18 15:40:32 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 4 is down. Admin state: down. Operational state: down"
2020-06-18 15:44:29 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 8 is down. Admin state: down. Operational state: down"
2020-06-18 15:46:33 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 9 is down. Admin state: down. Operational state: down"
2020-06-18 15:50:43 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 0 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 4 is up. Admin state: up. Operational state: up"
2020-06-18 15:53:25 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 0 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 8 is up. Admin state: up. Operational state: up"
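The comparison described in this post (traps seen in snmptt.debug versus traps forwarded in snmptrapsender.log) can be sketched in Python. This is only an illustrative check, not a Nagios tool: the function names are hypothetical, the regular expressions assume the exact message wording shown in the two logs above, and the snmptrapsender.log lines are shortened with "..." to keep the sample readable.

```python
import re

# Sample lines copied from the post; the sender lines are shortened ("...").
SNMPTT_DEBUG = """\
Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up
"""

SENDER_LOG = """\
2020-06-18 15:40:32 - /usr/bin/snmptrap ... nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 4 is down. Admin state: down. Operational state: down"
2020-06-18 15:44:29 - /usr/bin/snmptrap ... nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 8 is down. Admin state: down. Operational state: down"
2020-06-18 15:46:33 - /usr/bin/snmptrap ... nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 9 is down. Admin state: down. Operational state: down"
2020-06-18 15:50:43 - /usr/bin/snmptrap ... nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 4 is up. Admin state: up. Operational state: up"
2020-06-18 15:53:25 - /usr/bin/snmptrap ... nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 8 is up. Admin state: up. Operational state: up"
"""


def received_events(text):
    """Return (interface, state) pairs for traps logged in snmptt.debug."""
    pattern = re.compile(r"Link (up|down) on interface (\d+)")
    return [(int(m.group(2)), m.group(1)) for m in pattern.finditer(text)]


def forwarded_events(text):
    """Return (interface, state) pairs for traps logged by SNMP Trap Sender."""
    pattern = re.compile(r"interface (\d+) is (up|down)")
    return [(int(m.group(1)), m.group(2)) for m in pattern.finditer(text)]


def missing_events(received, forwarded):
    """Return received events that have no matching forwarded event."""
    remaining = list(forwarded)
    missing = []
    for event in received:
        if event in remaining:
            remaining.remove(event)  # consume one match per received event
        else:
            missing.append(event)
    return missing


if __name__ == "__main__":
    print(missing_events(received_events(SNMPTT_DEBUG),
                         forwarded_events(SENDER_LOG)))
```

With the sample lines above, the only unmatched event is the LinkUp for interface 9.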


Re: Nagios XI intermittent issue registering Normal status

Posted: Thu Jun 18, 2020 4:20 pm
by ssax
Please run this command as root and send me the resulting /tmp/SNMPFILES.zip file:

Code: Select all

zip -r /tmp/SNMPFILES.zip /etc/snmp /usr/share/snmp/mibs
Please PM me a copy of your system profile as well. You can download it from Admin > System Profile using the Download Profile button.

Re: Nagios XI intermittent issue registering Normal status

Posted: Fri Jun 19, 2020 2:21 am
by nfv_nagios
I have PMed you the files.

Re: Nagios XI intermittent issue registering Normal status

Posted: Fri Jun 19, 2020 4:29 pm
by ssax
I think the problem is that the service is already in an OK state. The warning/critical traps should work if you have is_volatile set, but further OKs are not sent once an OK result has already been sent. My assumption is that you still see the 2nd OK because you likely submitted the traps quickly enough that the status had not changed yet. With the default settings, the second and subsequent OKs should not be sent, because the event handler is not run unless a state change occurs or it's a problem trap (is_volatile only works for problems).

Re: Nagios XI intermittent issue registering Normal status

Posted: Sun Jun 21, 2020 10:45 pm
by nfv_nagios
Sorry, I don't quite get what you mean.

From snmptt.debug, we can see that there are state changes and that the OKs came in around 2 minutes apart.

snmptt.debug

Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up

Re: Nagios XI intermittent issue registering Normal status

Posted: Mon Jun 22, 2020 5:30 pm
by ssax

Code: Select all

1. Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
2. Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
3. Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
4. Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
5. Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
6. Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up
Traps 1, 2, 3, and 4 will send a notification and run an event handler (SNMP Trap Sender is technically an event handler) because is_volatile is set to On.

For the 5th and 6th traps there is no state change: the service goes from OK to OK, which should not fire the event handler/SNMP Trap Sender because an OK state is a reset.
When Are Event Handlers Executed?

Event handlers are executed when a service or host:

Is in a SOFT problem state
Initially goes into a HARD problem state
Initially recovers from a SOFT or HARD problem state
Taken from here:

https://assets.nagios.com/downloads/nag ... dlers.html

See here as well:

https://assets.nagios.com/downloads/nag ... types.html
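The rule described in this post can be sketched as a small Python model. This is a deliberate simplification and not Nagios code: the function names are hypothetical, SOFT/HARD states and check attempts are ignored, and it assumes all six traps update one single "SNMP Traps" service that starts in OK. It also does not model the timing effect mentioned earlier in the thread, which is why the real log still shows the 5th trap being forwarded.

```python
OK, CRITICAL = 0, 2  # Nagios service state IDs used in the logs above


def handler_fires(previous_state, new_state, is_volatile=True):
    """Return True if the event handler would run for this passive result."""
    if new_state != previous_state:
        return True  # a state change: a new problem or a recovery
    # No state change: only a volatile service in a problem state fires again.
    return is_volatile and new_state != OK


def replay(states, initial=OK, is_volatile=True):
    """Feed a sequence of results to one service; report which ones fire."""
    fired = []
    current = initial
    for state in states:
        fired.append(handler_fires(current, state, is_volatile))
        current = state
    return fired


if __name__ == "__main__":
    # Traps 1-3 are linkDown (Critical), traps 4-6 are linkUp (Normal/OK).
    traps = [CRITICAL, CRITICAL, CRITICAL, OK, OK, OK]
    print(replay(traps))  # [True, True, True, True, False, False]
```

In this model the repeated Critical results fire only because is_volatile is on; with is_volatile off, only the first Critical and the first OK would fire.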

Re: Nagios XI intermittent issue registering Normal status

Posted: Mon Jun 22, 2020 10:00 pm
by nfv_nagios
Thanks for the explanation.

I would like the event handler to trigger a notification for every OK state. Can you advise how this can be done?

Re: Nagios XI intermittent issue registering Normal status

Posted: Tue Jun 23, 2020 4:54 pm
by cdienger
To do this you would need to create an individual service in XI for each interface that you want to monitor. The snmptt.conf would also need to be updated to use the MATCH operator (http://snmptt.sourceforge.net/docs/snmp ... ONF-MATCH_).

https://assets.nagios.com/downloads/nag ... ios_XI.pdf has these examples:

Code: Select all

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Critical
FORMAT Link down on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Link down on interface $1. Admin state: $2. Operational state: $3"
and

Code: Select all

EVENT linkUp .1.3.6.1.6.3.1.1.5.4 "Status Events" Normal
FORMAT Link up on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Link up on interface $1. Admin state: $2. Operational state: $3"
Which would need to be modified to look something like:

Code: Select all

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Critical
FORMAT Link down on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Interface 1" "$s" "$@" "$-*" "Link down on interface $1. Admin state: $2. Operational state: $3"
MATCH $1: 1
and

Code: Select all

EVENT linkUp .1.3.6.1.6.3.1.1.5.4 "Status Events" Normal
FORMAT Link up on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Interface 1" "$s" "$@" "$-*" "Link up on interface $1. Admin state: $2. Operational state: $3"
MATCH $1: 1
A service in XI would then need to be configured with the description "SNMP Traps Interface 1".
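Because one linkDown/linkUp pair is needed per interface, writing the entries by hand gets repetitive. As a rough sketch, the snmptt.conf blocks could be generated with a short Python script. The script and its function names are hypothetical; it reuses the template from this post and assumes that $1 carries the interface number, as in the examples above. Interfaces 4, 8 and 9 are taken from the logs earlier in the thread.

```python
# Event name -> (trap OID, severity, message), taken from the examples above.
EVENTS = {
    "linkDown": (
        ".1.3.6.1.6.3.1.1.5.3",
        "Critical",
        "Link down on interface $1. Admin state: $2. Operational state: $3",
    ),
    "linkUp": (
        ".1.3.6.1.6.3.1.1.5.4",
        "Normal",
        "Link up on interface $1. Admin state: $2. Operational state: $3",
    ),
}


def snmptt_entry(name, interface):
    """Build one snmptt.conf entry that matches a single interface."""
    oid, severity, message = EVENTS[name]
    service = f"SNMP Traps Interface {interface}"  # must match the XI service
    return "\n".join([
        f'EVENT {name} {oid} "Status Events" {severity}',
        f"FORMAT {message}",
        # EXEC is kept on one line, as in the template above.
        f'EXEC /usr/local/bin/snmptraphandling.py "$r" "{service}" "$s" "$@" "$-*" "{message}"',
        f"MATCH $1: {interface}",
    ])


def snmptt_config(interfaces):
    """Build linkDown and linkUp entries for every interface in the list."""
    return "\n\n".join(
        snmptt_entry(name, interface)
        for interface in interfaces
        for name in EVENTS
    )


if __name__ == "__main__":
    print(snmptt_config([4, 8, 9]))
```

Each generated entry still needs a matching service in XI, named "SNMP Traps Interface 4", "SNMP Traps Interface 8", and so on.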