SNMP Traps are clearing live alarms when they shouldn't

vijilants
Posts: 215
Joined: Wed Jun 12, 2013 2:50 pm

SNMP Traps are clearing live alarms when they shouldn't

Post by vijilants »

Nagios XI version: 5.8.3
Release info: 3.10.0-957.5.1.el7.x86_64 x86_64
CentOS Linux release 7.6.1810 (Core)
Gnome is not installed

Hi,

We have an urgent issue on our system where a router or switch sends out a critical SNMP trap. This correctly raises a major alarm on the GUI.

However, when the next unrelated trap is received from the same device, such as a configuration notification trap classed as "Normal", it clears the existing critical alarm on the GUI.

Can you please advise what is causing this, as we are losing alarms?

See below for the snmptt trap snippet from when this last happened:

Code: Select all

Wed Jun  2 17:00:55 2021 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" SMTDBPISCMWR01 - A linkDown trap signifies that the SNMP entity, acting in 2 GigabitEthernet0/1 ethernetCsmacd down
Wed Jun  2 17:00:55 2021 .1.3.6.1.6.3.1.1.5.3 Critical "Status Events" SMTDBPISCMWR01 - A linkDown trap signifies that the SNMP entity, acting in 2 GigabitEthernet0/1 ethernetCsmacd down
Wed Jun  2 17:03:33 2021 .1.3.6.1.4.1.9.9.43.2.0.1 Normal "Status Events" SMTDBPISCMWR01 - Notification of a configuration management event as 1 3 2
Wed Jun  2 17:03:37 2021 .1.3.6.1.4.1.9.9.43.2.0.1 Normal "Status Events" SMTDBPISCMWR01 - Notification of a configuration management event as 1 2 3
Wed Jun  2 17:06:01 2021 .1.3.6.1.4.1.9.9.43.2.0.1 Normal "Status Events" SMTDBPISCMWR01 - Notification of a configuration management event as 1 2 3
Wed Jun  2 17:09:31 2021 .1.3.6.1.4.1.9.9.43.2.0.1 Normal "Status Events" SMTDBPISCMWR01 - Notification of a configuration management event as 1 3 2
Thu Jun  3 06:53:10 2021 .1.3.6.1.4.1.9.9.43.2.0.1 Normal "Status Events" SMTDBPISCMWR01 - Notification of a configuration management event as 1 3 2
[root@snmptt]# 


Attached is an image of the GUI showing the alarm cleared. There are no critical alarms for this device on the main services page even though the alarm is still active.

Note that this is happening on every device: 1) a critical alarm is generated by an SNMP trap; 2) another, unrelated trap with Normal status arrives from the same device, and as a result the initial GUI alarm is cleared.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by ssax »

Because they are coming into the same service description that will continue to happen. Generally, you'd want to pay attention to the notifications that you receive as the alerts can get reset in cases like this and the notifications are the only things that will show them outside of looking at the State History report.

An alternative would be to setup separate services for the different traps, that's the only way around it.
vijilants
Posts: 215
Joined: Wed Jun 12, 2013 2:50 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by vijilants »

ssax wrote:Because they are coming into the same service description that will continue to happen. Generally, you'd want to pay attention to the notifications that you receive as the alerts can get reset in cases like this and the notifications are the only things that will show them outside of looking at the State History report.

An alternative would be to setup separate services for the different traps, that's the only way around it.
This behaviour doesn't happen on our other system, and I don't understand why it should happen here. Also, this system has been in place for several years and this didn't happen before. I can send you the SNMPTT log and the State History report for exactly the same device to show you that the other system reacts correctly.

If you get an SNMP trap for a specific alarm, it should not be cleared unless you get a "Normal" SNMP trap for the associated event. If this weren't the case, there would be no point in going forward with this system for MIB-related SNMP alarming: every time something else came through with a Normal status, the previous unrelated alarm would be cleared, so using the system to monitor network routing components via SNMP traps would be pointless.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by ssax »

Does the other system have separate services for different traps? What I mean is: do traps with different OIDs still show on the other XI server under a single service called "SNMP Traps"?

If so, without specifying different service names for each trap definition (to differentiate the traps) here's what should always occur based on the Nagios Core functionality:
- linkDown trap received - "SNMP Traps" service set to Critical with linkDown trap information put into the plugin_output (Status Information column in the GUI)
- Config Management trap received - "SNMP Traps" service set to Ok with Config Management trap information put into the plugin_output (Status Information column in the GUI)

If that second trap is going to the same service description, it will always change the state and output of that service, because in Nagios's eyes they are a single service.

The only way that I know of to get around that would be to modify the traps and change the Service Description to something else so that they go to a different service, for example by modifying the linkUp and linkDown traps so that both go to the "SNMP Traps Link" service.

That is currently how the functionality works.
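
To make the behaviour described above concrete, here is a minimal Python sketch. It is not Nagios code: it only models status as one current state per host and service description. The `StatusTable` class and the "SNMP Traps Config" service name are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field

OK, CRITICAL = 0, 2  # Nagios plugin return codes for OK and CRITICAL


@dataclass
class StatusTable:
    """Toy model: one current state per (host, service description) pair."""
    states: dict = field(default_factory=dict)

    def submit(self, host, service, code, output):
        # A new passive result replaces whatever the service held before.
        self.states[(host, service)] = (code, output)

    def criticals(self, host):
        # Service descriptions on this host that are currently Critical.
        return sorted(svc for (h, svc), (code, _) in self.states.items()
                      if h == host and code == CRITICAL)


# Every trap mapped to one service: the later Normal trap overwrites linkDown.
shared = StatusTable()
shared.submit("SMTDBPISCMWR01", "SNMP Traps", CRITICAL, "linkDown GigabitEthernet0/1")
shared.submit("SMTDBPISCMWR01", "SNMP Traps", OK, "configuration management event")

# Traps split across two services: linkDown stays Critical.
split = StatusTable()
split.submit("SMTDBPISCMWR01", "SNMP Traps Link", CRITICAL, "linkDown GigabitEthernet0/1")
split.submit("SMTDBPISCMWR01", "SNMP Traps Config", OK, "configuration management event")

print(shared.criticals("SMTDBPISCMWR01"))  # []
print(split.criticals("SMTDBPISCMWR01"))   # ['SNMP Traps Link']
```

With a shared service description the Critical state is gone after the second trap; with separate descriptions it survives until a result arrives for that same service.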
vijilants
Posts: 215
Joined: Wed Jun 12, 2013 2:50 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by vijilants »

The other system does not have separate services for separate traps but behaves correctly.

So can you please explain how we go about sorting this out? We are managing in excess of 150 routers on the network, and if this cannot be resolved we will have to look at other systems.

Thanks
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by ssax »

vijilants wrote:The other system does not have separate services for separate traps but behaves correctly.
I would manually submit traps to test this out as I don't see how it could differ in functionality. What version of XI is the other system running? What version of Core is it running?

Code: Select all

/usr/local/nagios/bin/nagios -V
I have reached out to QA/dev to look this over to make sure I'm not missing anything. I will let you know what they say.

EDIT: Please PM me a copy of your profile.zip from that other system so I can review the configuration.
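
One way to submit a test result by hand is to write a passive service check result to the Nagios external command file. The Python sketch below formats the standard `PROCESS_SERVICE_CHECK_RESULT` external command. The command file path is an assumed default (check `command_file` in nagios.cfg), the helper names are hypothetical, and this is not confirmed to be what snmptraphandling.py does internally.

```python
import time

# Assumed default path of the Nagios external command file on an XI server;
# check command_file in nagios.cfg before relying on it.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def passive_result(host, service, code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"


def submit(line, command_file=COMMAND_FILE):
    """Write one command line to the Nagios command pipe (needs write access)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


if __name__ == "__main__":
    # Print the two lines that would mimic a Critical trap followed by a Normal one.
    print(passive_result("SMTDBPISCMWR01", "SNMP Traps", 2, "linkDown test"), end="")
    print(passive_result("SMTDBPISCMWR01", "SNMP Traps", 0, "config event test"), end="")
```

Passing each line to `submit()` on both servers, then comparing the "SNMP Traps" service state afterwards, would show whether the two systems really treat the same sequence differently.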
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by ssax »

I talked with development on this and they confirmed that I'm understanding the current functionality correctly on this. The only way around it would be to create separate services per trap type. You could utilize the API to create the services programmatically but the only way to differentiate the traps would be with different service descriptions on them so that the other traps don't impact them. You can see the API information under the Help menu in XI for working examples.
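
As a rough idea of what creating the services programmatically could look like, here is a Python sketch that builds (but does not send) one request per service. The endpoint path and field names are assumptions based on the XI config API; the check command, contact, and interval values are placeholders, and `build_service_request` is a hypothetical helper. Confirm everything against the API examples under the Help menu before using it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_service_request(base_url, api_key, host, service):
    """Build (but do not send) a POST request that would create one service."""
    query = urlencode({"apikey": api_key, "pretty": 1})
    # Assumed endpoint; verify it in the XI API help page.
    url = f"{base_url}/nagiosxi/api/v1/config/service?{query}"
    fields = {
        "host_name": host,
        "service_description": service,
        # Placeholder values: adjust to match your existing "SNMP Traps" service.
        "check_command": "check_dummy!0",
        "active_checks_enabled": 0,
        "passive_checks_enabled": 1,
        "max_check_attempts": 1,
        "check_interval": 5,
        "retry_interval": 5,
        "check_period": "24x7",
        "notification_interval": 60,
        "notification_period": "24x7",
        "contacts": "nagiosadmin",
    }
    return Request(url, data=urlencode(fields).encode(), method="POST")


# One request per trap-specific service on a host.
for name in ("SNMP Traps Link", "SNMP Traps Config"):
    req = build_service_request("http://xi.example.com", "YOUR_API_KEY",
                                "SMTDBPISCMWR01", name)
    print(req.get_method(), req.full_url, req.data.decode())
```

Sending each request with `urllib.request.urlopen(req)` in a loop over the host list is the natural next step once the fields have been checked on a test host.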
vijilants
Posts: 215
Joined: Wed Jun 12, 2013 2:50 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by vijilants »

Both systems are running the same version of XI, and Core on both is 4.4.6.

Can I not simply hash out the Configuration Notification mib entries in the snmptt.conf file? That way, I presume, the trap would still be in the snmptt.log file but would not be passed on to the GUI?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by ssax »

You would need to be more specific on what you mean by "Can I not simply hash out the Configuration Notification mib entries in the snmptt.conf file".

I assumed they were already in the snmptt.log file. If you're managing them through the snmptt.conf file, then as long as they don't have this line on them, they will not be submitted to Nagios but should still be in snmptt.log:

Code: Select all

EXEC /usr/local/bin/snmptraphandling.py "$aR" "SNMP Traps" "$s" "$@" "$-*" "A xxxx trap signifies that the SNMP entity, acting in an $*"
Only the EXEC lines on the traps cause them to be submitted to Nagios. That is also where you would change "SNMP Traps" to some other service name if you wanted separate services; the new service would show up in Admin > Unconfigured Objects if you hadn't already created a service with the same name.
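
For a large snmptt.conf, the EXEC change could be scripted rather than edited by hand. The Python sketch below is a hypothetical helper, not an official tool: for the named EVENT blocks it either swaps the service description on the EXEC line or comments the EXEC line out with `#`. The "SNMP Traps Config" name is an assumption; work on a backup copy, and expect snmptt to need a restart or reload before changes take effect.

```python
import re


def retarget_exec(conf_text, event_names, new_service=None, old_service="SNMP Traps"):
    """Rewrite the EXEC lines of the named EVENT blocks.

    With new_service set, swap the service description on the EXEC line.
    With new_service=None, comment the EXEC line out so the trap is still
    logged by snmptt but is no longer submitted to Nagios.
    """
    out, current = [], None
    for line in conf_text.splitlines():
        match = re.match(r"EVENT\s+(\S+)", line)
        if match:
            current = match.group(1)  # remember which EVENT block we are in
        elif line.startswith("EXEC ") and current in event_names:
            if new_service is None:
                line = "#" + line
            else:
                line = line.replace(f'"{old_service}"', f'"{new_service}"', 1)
        out.append(line)
    return "\n".join(out) + "\n"


SAMPLE = '''EVENT ciscoConfigManEvent .1.3.6.1.4.1.9.9.43.2.0.1 "Status Events" Normal
FORMAT Notification of a configuration management event as $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Notification of a configuration management event as $*"
'''

# Option 1: send this trap to its own service.
print(retarget_exec(SAMPLE, {"ciscoConfigManEvent"}, new_service="SNMP Traps Config"), end="")
# Option 2: keep it in snmptt.log only.
print(retarget_exec(SAMPLE, {"ciscoConfigManEvent"}), end="")
```

Either option keeps the Normal configuration traps away from the "SNMP Traps" service, so they can no longer reset a Critical state raised by another trap.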
vijilants
Posts: 215
Joined: Wed Jun 12, 2013 2:50 pm

Re: SNMP Traps are clearing live alarms when they shouldn't

Post by vijilants »

This is what is in the snmptt.conf which is causing the alarm to clear.

Code: Select all

MIB: CISCO-CONFIG-MAN-MIB (file:./CISCO-CONFIG-MAN-MIB.my) converted on Tue May 26 07:04:41 2020 using snmpttconvertmib v1.3
#
#
#
EVENT ciscoConfigManEvent .1.3.6.1.4.1.9.9.43.2.0.1 "Status Events" Normal
FORMAT Notification of a configuration management event as $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Notification of a configuration management event as $*"
SDESC
Notification of a configuration management event as
recorded in ccmHistoryEventTable.
Variables:
  1: ccmHistoryEventCommandSource
  2: ccmHistoryEventConfigSource
  3: ccmHistoryEventConfigDestination
EDESC
#
#
#
EVENT ccmCLIRunningConfigChanged .1.3.6.1.4.1.9.9.43.2.0.2 "Status Events" Normal
FORMAT This notification indicates that the running $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "This notification indicates that the running $*"
SDESC
This notification indicates that the running
configuration of the managed system has changed
from the CLI.
If the managed system supports a separate
configuration mode(where the configuration commands
are entered under a  configuration session which
affects the running configuration of the system),
then this notification is sent when the configuration
mode is exited.
During this configuration session there can be
one or more running configuration changes.
Variables:
  1: ccmHistoryRunningLastChanged
  2: ccmHistoryEventTerminalType
EDESC
#
#
#
EVENT ccmCTIDRolledOver .1.3.6.1.4.1.9.9.43.2.0.3 "Status Events" Normal
FORMAT This notification indicates that the Config Change Tracking $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "This notification indicates that the Config Change Tracking $*"
SDESC
This notification indicates that the Config Change Tracking
ID has rolled over and will be reset.
Variables:
EDESC
#
#


Which part do I need to hash out to stop it being sent to the GUI?

Also, if I took the approach of setting up a different service for these Normal traps, I presume they would then not interfere with any other incoming traps? If so, can you please walk me through the process of setting up another service for these traps, as I am unsure how to do that.

Thanks