SNMP traps being received but not updating in Nagios

Post by **snapon_admin** » Thu Sep 07, 2017 10:31 am

I'm seeing a ton of traps coming into my Nagios server, and my snmptt.log and snmpttunknown.log files are updating with all of the new messages, but nothing is changing on the GUI side of things. Any thoughts on why that might be? To test this I added a host that was showing up in Unconfigured Objects. I believe it's a firewall; we're not monitoring traps on it, but it is configured to send them to Nagios for some reason. This is the snmptt.log output from 5 minutes ago:

Code: Select all

Thu Sep  7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:00 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:00 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:02 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:02 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:26 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:26 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep  7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47

So messages are coming in, but then in the Nagios web GUI I just have pending and 'No check results for service yet...' messages.

Nagios XI 5.4.8
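
To see at a glance which devices are responsible for the trap volume, the snmptt.log excerpt above can be summarized per sending host. This is only a minimal sketch: the line layout (timestamp, trap OID, severity, quoted category, host, " - ", message) is inferred from the excerpt in this post, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the excerpt above (an assumption, not a documented format):
# <weekday> <month> <day> <time> <year> <oid> <severity> "<category>" <host> - <message>
LINE_RE = re.compile(
    r'^(?P<when>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) '
    r'(?P<oid>\S+) (?P<severity>\S+) "(?P<category>[^"]*)" '
    r'(?P<host>\S+) - (?P<message>.*)$'
)


def parse_snmptt_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def traps_per_host(lines):
    """Count logged traps per sending host, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        record = parse_snmptt_line(line)
        if record:
            counts[record["host"]] += 1
    return counts
```

Feeding it the lines of snmptt.log (for example `traps_per_host(open(path))`, where `path` is wherever the log lives on your system) gives a count per host, which makes it easier to tell whether one noisy device accounts for most of the entries.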

Post by **cdienger** » Thu Sep 07, 2017 10:48 am

Is there anything in /usr/local/nagios/var/nagios.log?

The unconfigured objects list shows devices using NSCA or NRDP, but you'll need to set up an SNMP trap check. https://assets.nagios.com/downloads/nag ... ios_XI.pdf has more information on using the SNMP Trap wizard.

Post by **snapon_admin** » Thu Sep 07, 2017 11:01 am

The unconfigured objects list also shows traps coming in that haven't been configured yet. That was how I added this test host, and I've definitely added trap alerts this way before. We have had SNMP trap checks in Nagios for some time; they just recently stopped updating, so I checked the trap logs and can see the traps still coming in.

As for /usr/local/nagios/var/nagios.log, there's plenty of stuff in there about traps and this host in particular.

Code: Select all

[1504796010] Warning:  Passive check result was received for service 'SNMP Traps' on host '10.10.250.6', but the host could not be found!
[1504796010] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.10.250.6;SNMP Traps;0;An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47 / enterprises.9.2.1.5.0 ():10.10.129.47 enterprises.9.9.412.1.1.1.0 ():1 enterprises.9.9.412.1.1.2.0 ():10.10.129.47
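
The warning above says the passive result was dropped because Nagios could not find a host named exactly '10.10.250.6'. A rough way to list every host/service pair that is being rejected like this is to scan nagios.log for that warning. The sketch below assumes the wording shown in the log excerpt; the "service could not be found" variant is included on the assumption that Nagios Core logs it in the same shape, and the function name is hypothetical.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Pattern based on the warning text in the excerpt above. The "service" variant
# is an assumption about how Nagios Core words the equivalent message.
DROPPED_RE = re.compile(
    r"^\[(?P<epoch>\d+)\] Warning:\s+Passive check result was received for "
    r"service '(?P<service>.+?)' on host '(?P<host>.+?)', "
    r"but the (?P<missing>host|service) could not be found!"
)


def dropped_passive_results(lines):
    """Return (counts, latest) keyed by (host, service, what_was_missing).

    counts: how many passive results were dropped for each key.
    latest: UTC datetime of the most recent drop seen for each key.
    """
    counts = Counter()
    latest = {}
    for line in lines:
        match = DROPPED_RE.match(line)
        if not match:
            continue
        key = (match["host"], match["service"], match["missing"])
        counts[key] += 1
        latest[key] = datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc)
    return counts, latest
```

Any key that shows up here is a trap that reached Nagios as a passive result but had no matching object to attach to, so the host name or service description in the running config is the first thing to compare.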

Post by **snapon_admin** » Thu Sep 07, 2017 11:03 am

This is also in that log, but I'm pretty sure it's just a freshness check thing. Lisgrid01p is one of the 2 main hosts we have in Nagios that have been monitoring traps for a while now.

Code: Select all

[1504788204] Warning: The results of service 'SNMP Traps' on host 'lisgrid01p' are stale by 0d 0h 0m 1s (threshold=2d 0h 0m 0s).  I'm forcing an immediate check of the service.

Post by **cdienger** » Thu Sep 07, 2017 12:38 pm

Do you see a host in the CCM for 10.10.250.6? Open /usr/local/nagios/var/status.dat, find the SNMP Traps service entry for the host, and post it (remove anything you want to keep private). It should look something like:

Code: Select all

servicestatus {
        host_name=10.10.250.6
        service_description=SNMP Traps
        modified_attributes=0
        ...
}

You can also try refreshing the config under Configure > CCM > Tools > Config File Management. Click Delete Files, Write Configs, Verify Files, and Restart Nagios Core, in that order, to refresh it.
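
Since status.dat can be large, the lookup described above can also be scripted. The following is a minimal sketch that assumes the block layout shown in the example (a `type {` line, `key=value` lines, and a closing `}`); the function names are made up for illustration and are not part of Nagios.

```python
def parse_status_blocks(text):
    """Split status.dat text into a list of (block_type, fields) tuples."""
    blocks = []
    block_type = None
    fields = None
    for raw in text.splitlines():
        line = raw.strip()
        if fields is None:
            # Outside a block: look for an opening line such as "servicestatus {"
            if line.endswith("{"):
                block_type = line[:-1].strip()
                fields = {}
        elif line == "}":
            blocks.append((block_type, fields))
            block_type, fields = None, None
        elif "=" in line:
            # Split on the first "=" only, since values may contain "=" themselves
            key, _, value = line.partition("=")
            fields[key] = value
    return blocks


def find_service(blocks, host, description):
    """Return the fields of the matching servicestatus block, or None."""
    for block_type, fields in blocks:
        if (
            block_type == "servicestatus"
            and fields.get("host_name") == host
            and fields.get("service_description") == description
        ):
            return fields
    return None
```

If `find_service(blocks, "10.10.250.6", "SNMP Traps")` returns `None`, the running Nagios process has no such service, which would be consistent with passive results for it being rejected.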

Post by **snapon_admin** » Thu Sep 07, 2017 3:14 pm

That file doesn't seem to exist.

Code: Select all

[root@lisl-ngos-01-pv var]# ll /usr/local/nagios/var            
total 153384
drwxrwxr-x. 2 nagios nagios    73728 Sep  6 23:59 archives
-rw-r--r--. 1 apache apache 50013986 May 11  2015 graphapi.log
-rw-r--r--. 1 nagios nagios     7978 Apr 25 11:47 host-perfdata
-rw-r--r--. 1 nagios nagios    22084 Sep  7 13:35 nagios.configtest
-rw-r--r--. 1 nagios nagios        6 Sep  7 13:35 nagios.lock
-rw-rw-r--. 1 nagios nagios 12840408 Sep  7 15:13 nagios.log
-rw-rw-r--. 1 nagios users  19380236 Jan 15  2015 nagios.tmp1QpCnR
-rw-------. 1 nagios nagios  1323008 Jan 15  2015 nagios.tmpcY0Kxp
-rw-r--r--. 1 nagios nagios        6 Jun 22 12:24 ndo2db.lock
-rw-r--r--. 1 nagios nagios        0 Sep  7 13:34 ndomod.tmp
srwxr-xr-x. 1 nagios nagios        0 Jun 22 12:24 ndo.sock
-rw-r--r--. 1 nagios nagios  9346204 Apr 25 11:48 npcd.log
-rw-r--r--. 1 nagios nagios 10485817 Oct  1  2013 npcd.log.old
-rw-r--r--. 1 nagios nagios 13069207 Apr 25 11:46 objects.cache
-rw-r--r--. 1 nagios nagios 13529934 Sep  7 13:35 objects.precache
-rw-rw-rw-. 1 nagios nagios  5517295 Mar 16 14:00 perfdata.log
-rw-------. 1 nagios nagios 21084063 Sep  7 14:35 retention.dat
drwxrwsr-x. 2 nagios nagcmd     4096 Sep  7 13:35 rw
-rw-r--r--. 1 nagios nagios   230597 Apr 25 11:47 service-perfdata
drwxr-xr-x. 5 root   root       4096 Nov 13  2013 spool
drwxr-xr-x. 2 nagios nagios     4096 Sep  7 15:13 stats

Post by **scottwilkerson** » Thu Sep 07, 2017 3:17 pm

snapon_admin wrote: That file doesn't seem to exist.

If you have a RAM disk set up, it may be located in /var/nagiosramdisk/
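
Rather than guessing between the two directories, the location can be read from the `status_file` directive in the main Nagios config (typically /usr/local/nagios/etc/nagios.cfg on an XI install, though that path is an assumption about this system). A minimal sketch, with a hypothetical helper name:

```python
def status_file_from_config(cfg_text):
    """Return the value of the status_file directive in nagios.cfg text, or None."""
    path = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "status_file":
            path = value.strip()  # if the directive repeats, keep the last one seen
    return path
```

For example, `status_file_from_config(open("/usr/local/nagios/etc/nagios.cfg").read())` should return the path Nagios Core is actually writing its status data to.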

Post by **snapon_admin** » Thu Sep 07, 2017 3:26 pm

Thanks, Scott. Here's the entry for the test host:

Code: Select all

hoststatus {
	host_name=10.10.250.6
	modified_attributes=0
	check_command=check_dummy!0!"No data received yet."
	check_period=xi_timeperiod_24x7
	notification_period=xi_timeperiod_24x7
	check_interval=5.000000
	retry_interval=1.000000
	event_handler=
	has_been_checked=0
	should_be_scheduled=0
	check_execution_time=0.000
	check_latency=0.000
	check_type=0
	current_state=0
	last_hard_state=0
	last_event_id=0
	current_event_id=0
	current_problem_id=0
	last_problem_id=0
	plugin_output=
	long_plugin_output=
	performance_data=
	last_check=0
	next_check=0
	check_options=0
	current_attempt=1
	max_attempts=5
	state_type=1
	last_state_change=1504809277
	last_hard_state_change=1504809277
	last_time_up=0
	last_time_down=0
	last_time_unreachable=0
	last_notification=0
	next_notification=0
	no_more_notifications=0
	current_notification_number=0
	current_notification_id=0
	notifications_enabled=1
	problem_has_been_acknowledged=0
	acknowledgement_type=0
	active_checks_enabled=0
	passive_checks_enabled=1
	event_handler_enabled=1
	flap_detection_enabled=1
	process_performance_data=1
	obsess=1
	last_update=1504815744
	is_flapping=0
	percent_state_change=0.00
	scheduled_downtime_depth=0
	_XIWIZARD=0;passiveobject
	}

And here's the entry for the trap host that we actually have been monitoring via traps for a while:

Code: Select all

servicestatus {
	host_name=lisgrid01p
	service_description=SNMP Traps
	modified_attributes=0
	check_command=restart_snmptt!!!!!!!!
	check_period=xi_timeperiod_24x7
	notification_period=xi_timeperiod_24x7
	check_interval=1.000000
	retry_interval=1.000000
	event_handler=
	has_been_checked=1
	should_be_scheduled=0
	check_execution_time=15.789
	check_latency=0.000
	check_type=0
	current_state=0
	last_hard_state=0
	last_event_id=3635958
	current_event_id=3635959
	current_problem_id=0
	last_problem_id=1658571
	current_attempt=1
	max_attempts=1
	state_type=1
	last_state_change=1492361259
	last_hard_state_change=1492361259
	last_time_ok=1504788204
	last_time_warning=0
	last_time_unknown=0
	last_time_critical=1492361261
	plugin_output=No traps received, restarting snmptt in case it's stuck. snmptt Service restarted successfully.
	long_plugin_output=
	performance_data=
	last_check=1504788204
	next_check=1504788264
	check_options=0
	current_notification_number=0
	current_notification_id=1027212
	last_notification=0
	next_notification=0
	no_more_notifications=0
	notifications_enabled=1
	active_checks_enabled=0
	passive_checks_enabled=1
	event_handler_enabled=1
	problem_has_been_acknowledged=0
	acknowledgement_type=0
	flap_detection_enabled=0
	process_performance_data=1
	obsess=1
	last_update=1504815744
	is_flapping=0
	percent_state_change=0.00
	scheduled_downtime_depth=0
	_XIWIZARD=0;passiveobject
	}
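
The epoch timestamps in the lisgrid01p entry show how long it has been since that service last got a result. A small sketch using the two values from the entry above (the helper name is made up):

```python
from datetime import timedelta


def seconds_since_last_check(fields):
    """Gap between the last status update and the last check result, in seconds."""
    return int(fields["last_update"]) - int(fields["last_check"])


# Values copied from the lisgrid01p servicestatus entry above
entry = {"last_check": "1504788204", "last_update": "1504815744"}
gap = timedelta(seconds=seconds_since_last_check(entry))
print(gap)
```

That works out to roughly 7 hours 39 minutes. The last_check value also equals the timestamp on the "stale" warning in nagios.log, and the plugin output is the freshness command's "No traps received" message, so the most recent result for this service appears to have come from the forced freshness check rather than from a trap.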

Post by **scottwilkerson** » Thu Sep 07, 2017 3:39 pm

You need to create an "SNMP Traps" service for the host 10.10.250.6.

You can do that by running the SNMP Trap Wizard and selecting 10.10.250.6 as the host.

Post by **snapon_admin** » Thu Sep 07, 2017 4:18 pm

I did that. And that also doesn't explain why the other 2 devices we have been monitoring traps on for years are sending traps (the logs show the traps coming in) but not updating Nagios. 2 hours ago it appears the new test host I added did receive a trap, but it hasn't updated since. Is there a limit to how many traps Nagios can receive before it just freaks out and can't handle it anymore? I've noticed that we have a ton of unconfigured hosts sending Nagios traps that Nagios just isn't configured to monitor. Could this just be a case of too many devices sending traps?

The top host in this screen is the "test" host I added temporarily just to see if traps would update Nagios at all. The lisgrid01p device gets an error on the device when trying to send a trap, and the keno-etrk-01-pv host sends the trap but Nagios never updates.
