Nagios XI intermittent issue registering Normal status

nfv_nagios
Posts: 35
Joined: Sun Jan 08, 2017 8:00 pm

Nagios XI intermittent issue registering Normal status

Post by nfv_nagios »

Hi Support,

I am using Nagios XI 5.4.0 on Red Hat (kernel 2.6.32-642.11.1.el6.x86_64).
SNMP Trap Sender version 1.6.2

I have configured Nagios XI to receive Cisco switch SNMP Traps.
I am able to receive both LinkDown (Critical severity) and LinkUp (Normal severity) traps for different ports.
All of these traps are logged in /var/log/snmptt/snmptt.debug.
However, in snmptrapsender.log I can see all the LinkDown events but not all of the corresponding LinkUp events.

As you can see below, only the LinkUp events for interfaces 4 and 8 are registered in snmptrapsender.log.

After several rounds of testing, I realised that it is always either the 2nd or the 3rd LinkUp event that fails to register in snmptrapsender.log. Can you advise how to resolve this?

snmptt.debug

Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up


snmptrapsender.log

2020-06-18 15:40:32 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 4 is down. Admin state: down. Operational state: down"
2020-06-18 15:44:29 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 8 is down. Admin state: down. Operational state: down"
2020-06-18 15:46:33 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 2 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link down interface 9 is down. Admin state: down. Operational state: down"
2020-06-18 15:50:43 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 0 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 4 is up. Admin state: up. Operational state: up"
2020-06-18 15:53:25 - /usr/bin/snmptrap -v 2c -c public 10.10.10.10 '' NAGIOS-NOTIFY-MIB::nSvcNotify nSvcNotifyType s "0" nSvcNotifyNum i 0 nSvcAckAuthor s "0" nSvcAckComment s "0" nHostname s "PL_DQMS_SW1" nHostStateID i 0 nSvcDesc s "SNMP Traps" nSvcStateID i 0 nSvcAttempt i 0 nSvcDurationSec i 0 nSvcGroupName s "0" nSvcLastCheck i 0 nSvcLastChange i 0 nSvcOutput s "Host: 1.1.1.1: SvcDesc: Link on interface 8 is up. Admin state: up. Operational state: up"
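To make the gap between the two logs easy to spot, the comparison can be scripted. The following is only a rough sketch: the regular expressions are inferred from the sample lines in this post, the sample lists are abbreviated copies of those lines, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Patterns inferred from the sample lines in this thread only (assumption).
SNMPTT_RE = re.compile(r"Link (up|down) on interface (\d+)")
SENDER_RE = re.compile(r"interface (\d+) is (up|down)")


def snmptt_events(lines):
    """Count (interface, state) pairs found in snmptt.debug lines."""
    events = Counter()
    for line in lines:
        m = SNMPTT_RE.search(line)
        if m:
            events[(int(m.group(2)), m.group(1))] += 1
    return events


def sender_events(lines):
    """Count (interface, state) pairs found in snmptrapsender.log lines."""
    events = Counter()
    for line in lines:
        m = SENDER_RE.search(line)
        if m:
            events[(int(m.group(1)), m.group(2))] += 1
    return events


def missing_from_sender(snmptt_lines, sender_lines):
    """Return events that SNMPTT logged but the trap sender did not."""
    diff = snmptt_events(snmptt_lines) - sender_events(sender_lines)
    return sorted(diff.elements())


# Abbreviated copies of the log lines quoted above.
SNMPTT_SAMPLE = [
    "Link down on interface 4 down down",
    "Link down on interface 8 down down",
    "Link down on interface 9 down down",
    "Link up on interface 4 up up",
    "Link up on interface 8 up up",
    "Link up on interface 9 up up",
]
SENDER_SAMPLE = [
    "SvcDesc: Link down interface 4 is down. Admin state: down.",
    "SvcDesc: Link down interface 8 is down. Admin state: down.",
    "SvcDesc: Link down interface 9 is down. Admin state: down.",
    "SvcDesc: Link on interface 4 is up. Admin state: up.",
    "SvcDesc: Link on interface 8 is up. Admin state: up.",
]

print(missing_from_sender(SNMPTT_SAMPLE, SENDER_SAMPLE))
# [(9, 'up')]
```

For real use, the two lists would be replaced by the lines read from the actual log files.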

Annotation 2020-06-18 185329.png
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios XI intermittent issue registering Normal status

Post by ssax »

Please run this command as root and send me the resulting /tmp/SNMPFILES.zip file:

Code:

zip -r /tmp/SNMPFILES.zip /etc/snmp /usr/share/snmp/mibs
Please PM me a copy of your profile as well, you can download it from Admin > System Profile > Download Profile button.
nfv_nagios
Posts: 35
Joined: Sun Jan 08, 2017 8:00 pm

Re: Nagios XI intermittent issue registering Normal status

Post by nfv_nagios »

Have PMed you the files.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios XI intermittent issue registering Normal status

Post by ssax »

I think the problem is that the service is already in an OK state. The Warning/Critical results should go through if you have is_volatile set, but further OKs are not sent once an OK result has already been processed. My assumption is that you still saw the 2nd OK because the traps were submitted quickly enough that the status had not yet changed. With the default settings, the second and later OKs should not be sent, because the event handler is not run unless a state change occurs or it is a problem trap (is_volatile only works for problems).
nfv_nagios
Posts: 35
Joined: Sun Jan 08, 2017 8:00 pm

Re: Nagios XI intermittent issue registering Normal status

Post by nfv_nagios »

Sorry, I don't quite get what you mean.

From snmptt.debug, we can see there are state changes and the OKs came in around 2 mins apart.

snmptt.debug

Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios XI intermittent issue registering Normal status

Post by ssax »

Code:

1. Thu Jun 18 15:40:26 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 4 down down
2. Thu Jun 18 15:44:20 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 8 down down
3. Thu Jun 18 15:46:27 2020 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" 1.1.1.1 - Link down on interface 9 down down
4. Thu Jun 18 15:50:39 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 4 up up
5. Thu Jun 18 15:53:15 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 8 up up
6. Thu Jun 18 15:55:08 2020 .1.3.6.1.6.3.1.1.5.4 Normal "Status Events" 1.1.1.1 - Link up on interface 9 up up
Traps 1, 2, 3, and 4 will send a notification and run an event handler (SNMP Trap Sender is technically an event handler) because is_volatile is set to On.

For the 5th and 6th traps there is no state change: the service goes from OK to OK, which should not fire the event handler/SNMP Trap Sender because an OK state is a reset.
When Are Event Handlers Executed?

Event handlers are executed when a service or host:

Is in a SOFT problem state
Initially goes into a HARD problem state
Initially recovers from a SOFT or HARD problem state
Taken from here:

https://assets.nagios.com/downloads/nag ... dlers.html

See here as well:

https://assets.nagios.com/downloads/nag ... types.html
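The behaviour described above can be illustrated with a small simulation. This is a deliberately simplified model and not how Nagios Core is implemented: it ignores SOFT/HARD states and retries, assumes the service starts in OK, and assumes the handler runs on any state change or, for a volatile service, on every non-OK result. All names are hypothetical.

```python
# Simplified model of when the event handler (and so SNMP Trap Sender) runs.
# Assumptions: no SOFT/HARD distinction, service starts in OK.
OK, WARNING, CRITICAL = 0, 1, 2


def handler_runs(previous_state, new_state, is_volatile):
    """Run on any state change, or on every problem result if volatile."""
    if new_state != previous_state:
        return True
    return is_volatile and new_state != OK


def simulate(results, is_volatile=True, initial_state=OK):
    """Return one True/False per passive result: did the handler run?"""
    fired = []
    state = initial_state
    for new_state in results:
        fired.append(handler_runs(state, new_state, is_volatile))
        state = new_state
    return fired


# The six traps from the numbered list above: three link downs, three link ups.
trap_results = [CRITICAL, CRITICAL, CRITICAL, OK, OK, OK]
print(simulate(trap_results))
# [True, True, True, True, False, False]
```

Under this model the 5th and 6th results never reach the handler, because all interfaces report into the same "SNMP Traps" service and that service is already OK.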
nfv_nagios
Posts: 35
Joined: Sun Jan 08, 2017 8:00 pm

Re: Nagios XI intermittent issue registering Normal status

Post by nfv_nagios »

Thanks for the explanation.

I would like the event handler to trigger a notification for every OK state. Can you advise how this can be done?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios XI intermittent issue registering Normal status

Post by cdienger »

To do this you would need to create an individual service in XI for each interface that you want to monitor. The snmptt.conf would also need to be updated to use the MATCH operator (http://snmptt.sourceforge.net/docs/snmp ... ONF-MATCH).

https://assets.nagios.com/downloads/nag ... ios_XI.pdf has these examples:

Code:

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Critical
FORMAT Link down on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Link down on interface $1. Admin state: $2. Operational state: $3"
and

Code:

EVENT linkUp .1.3.6.1.6.3.1.1.5.4 "Status Events" Normal
FORMAT Link up on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Link up on interface $1. Admin state: $2. Operational state: $3"
Which would need to be modified to look something like:

Code:

EVENT linkDown .1.3.6.1.6.3.1.1.5.3 "Status Events" Critical
FORMAT Link down on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Interface 1" "$s" "$@" "$-*" "Link down on interface $1. Admin state: $2. Operational state: $3"
MATCH $1: 1
and

Code:

EVENT linkUp .1.3.6.1.6.3.1.1.5.4 "Status Events" Normal
FORMAT Link up on interface $1. Admin state: $2. Operational state: $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Interface 1" "$s" "$@" "$-*" "Link up on interface $1. Admin state: $2. Operational state: $3"
MATCH $1: 1
A service in XI would then need to be configured with the description "SNMP Traps Interface 1".
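Since one linkDown/linkUp pair is needed per interface, the entries could be generated rather than typed by hand. The script below is a minimal sketch that follows the pattern of the modified examples above; the interface indexes 4, 8 and 9 come from the logs earlier in the thread, the function names are hypothetical, and the output should be reviewed before it goes anywhere near a live snmptt.conf.

```python
# Generates snmptt.conf entries following the pattern shown above.
# Each EXEC directive is kept on a single line.
STANZA = (
    'EVENT {event} {oid} "Status Events" {severity}\n'
    "FORMAT Link {direction} on interface $1. Admin state: $2. "
    "Operational state: $3\n"
    'EXEC /usr/local/bin/snmptraphandling.py "$r" "{service}" "$s" "$@" "$-*" '
    '"Link {direction} on interface $1. Admin state: $2. '
    'Operational state: $3"\n'
    "MATCH $1: {index}\n"
)

# (event name, OID, severity, direction), taken from the examples above.
TRAPS = [
    ("linkDown", ".1.3.6.1.6.3.1.1.5.3", "Critical", "down"),
    ("linkUp", ".1.3.6.1.6.3.1.1.5.4", "Normal", "up"),
]


def service_description(index):
    """Service description that must also exist as a service in XI."""
    return f"SNMP Traps Interface {index}"


def build_stanzas(interface_indexes):
    """Return snmptt.conf text with one linkDown/linkUp pair per interface."""
    blocks = []
    for index in interface_indexes:
        for event, oid, severity, direction in TRAPS:
            blocks.append(
                STANZA.format(
                    event=event,
                    oid=oid,
                    severity=severity,
                    direction=direction,
                    service=service_description(index),
                    index=index,
                )
            )
    return "\n".join(blocks)


print(build_stanzas([4, 8, 9]))
```

Each generated service description ("SNMP Traps Interface 4" and so on) still needs a matching service in XI. How SNMPTT treats several EVENT entries for the same OID depends on its own settings, so that is worth checking against the SNMPTT documentation.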