This support forum board is for support questions relating to
Nagios XI, our flagship commercial network monitoring solution.
snapon_admin
Posts: 952 Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI
by snapon_admin » Thu Sep 07, 2017 10:31 am
I'm seeing a ton of traps coming into my Nagios server, and my snmptt.log and snmpttunknown.log files are updating with all of the new messages, but nothing is changing on the GUI side of things. Any thoughts on why that might be? To test this I added a host that was showing up in unconfigured objects. It's a firewall, I believe; we're not monitoring traps on it, but it is configured to send them to Nagios for some reason. This is the snmptt.log output from 5 minutes ago:
Code: Select all
Thu Sep 7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:00 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:00 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:02 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:02 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:26 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:26 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
Thu Sep 7 10:26:27 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" 10.10.250.6 - An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47
So messages are coming in, but then in the Nagios web GUI I just have pending and 'No check results for service yet...' messages.
Nagios XI 5.4.8
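For anyone wanting to summarize a log like the one above, here is a minimal Python sketch that splits each snmptt.log line into its fields and counts traps per source host and severity. The regular expression is inferred only from the sample lines in this post (timestamp, OID, severity, quoted category, host, then the message), so a differently configured snmptt log format may need a different pattern. The function names are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this post (an assumption, not an
# official snmptt format definition).
LINE_RE = re.compile(
    r'^(?P<timestamp>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) '
    r'(?P<oid>\S+) (?P<severity>\S+) "(?P<category>[^"]*)" '
    r'(?P<host>\S+) - (?P<message>.*)$'
)


def parse_snmptt_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_by_host_and_severity(lines):
    """Count parsed traps per (host, severity) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        fields = parse_snmptt_line(line)
        if fields:
            counts[(fields["host"], fields["severity"])] += 1
    return counts


sample = [
    'Thu Sep 7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 NORMAL "Status Events" '
    '10.10.250.6 - An authenticationFailure trap signifies that the SNMP '
    '10.10.129.47 1 10.10.129.47',
    'Thu Sep 7 10:25:56 2017 .1.3.6.1.6.3.1.1.5.5 Normal "Status Events" '
    '10.10.250.6 - An authenticationFailure trap signifies that the SNMP '
    '10.10.129.47 1 10.10.129.47',
]
counts = count_by_host_and_severity(sample)
for (host, severity), total in sorted(counts.items()):
    print(host, severity, total)
```

To run it against the real file, pass an open file handle (for example `open("snmptt.log")`, with whatever path your install uses) instead of `sample`.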
cdienger
Support Tech
Posts: 5045 Joined: Tue Feb 07, 2017 11:26 am
by cdienger » Thu Sep 07, 2017 10:48 am
Is there anything in /usr/local/nagios/var/nagios.log?
The unconfigured objects list shows devices using NSCA or NRDP, but you'll need to set up an SNMP trap check.
https://assets.nagios.com/downloads/nag ... ios_XI.pdf has more information on using the SNMP Trap Wizard.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
by snapon_admin » Thu Sep 07, 2017 11:01 am
The unconfigured objects list also shows traps coming in that haven't been configured yet. That was how I added this test host, and I've definitely added trap alerts this way before. We already have SNMP trap checks in Nagios and have had them for some time. They just recently stopped updating, so I checked the trap logs and can see the traps still coming in.
As for /usr/local/nagios/var/nagios.log, there's plenty of stuff in there about traps and this host in particular.
Code: Select all
[1504796010] Warning: Passive check result was received for service 'SNMP Traps' on host '10.10.250.6', but the host could not be found!
[1504796010] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.10.250.6;SNMP Traps;0;An authenticationFailure trap signifies that the SNMP 10.10.129.47 1 10.10.129.47 / enterprises.9.2.1.5.0 ():10.10.129.47 enterprises.9.9.412.1.1.1.0 ():1 enterprises.9.9.412.1.1.2.0 ():10.10.129.47
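A minimal Python sketch for pulling every "host could not be found" warning out of nagios.log, so you can see at a glance which host names the passive results are being submitted under. The message wording is copied from the excerpt above; the function name is hypothetical, and other Nagios Core versions may word the warning differently.

```python
import re
from collections import Counter

# Wording taken from the nagios.log excerpt in this post. \s+ after "Warning:"
# tolerates one or more spaces.
MISSING_HOST_RE = re.compile(
    r"^\[(?P<epoch>\d+)\] Warning:\s+Passive check result was received for "
    r"service '(?P<service>[^']*)' on host '(?P<host>[^']*)', "
    r"but the host could not be found!"
)


def missing_hosts(log_lines):
    """Count rejected passive results per (host, service) pair."""
    counts = Counter()
    for line in log_lines:
        match = MISSING_HOST_RE.match(line)
        if match:
            counts[(match.group("host"), match.group("service"))] += 1
    return counts


sample = [
    "[1504796010] Warning: Passive check result was received for service "
    "'SNMP Traps' on host '10.10.250.6', but the host could not be found!",
    "[1504796010] Error: External command failed -> "
    "PROCESS_SERVICE_CHECK_RESULT;10.10.250.6;SNMP Traps;0;...",
]
result = missing_hosts(sample)
for (host, service), total in result.items():
    print(f"{host} / {service}: {total} rejected result(s)")
```

The host name in the warning has to match a configured host_name exactly, so a list like this is a quick way to compare against what is defined in the CCM.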
by snapon_admin » Thu Sep 07, 2017 11:03 am
This is also in that log, but I'm pretty sure it's just a freshness check thing. lisgrid01p is one of the 2 main hosts we have in Nagios that we have been monitoring traps on for a while now.
Code: Select all
[1504788204] Warning: The results of service 'SNMP Traps' on host 'lisgrid01p' are stale by 0d 0h 0m 1s (threshold=2d 0h 0m 0s). I'm forcing an immediate check of the service.
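The bracketed numbers at the start of each nagios.log line are Unix epoch timestamps. A small Python sketch for making them readable when lining them up against snmptt.log entries; it prints UTC by default, so pass your server's timezone if you want local time. The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

EPOCH_PREFIX_RE = re.compile(r"^\[(\d+)\] ")


def humanize(line, tz=timezone.utc):
    """Replace a leading [epoch] prefix with a formatted date; leave other lines alone."""
    match = EPOCH_PREFIX_RE.match(line)
    if not match:
        return line
    stamp = datetime.fromtimestamp(int(match.group(1)), tz)
    return f"[{stamp.strftime('%Y-%m-%d %H:%M:%S %Z')}] {line[match.end():]}"


print(humanize("[1504788204] Warning: The results of service 'SNMP Traps' "
               "on host 'lisgrid01p' are stale"))
```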
by cdienger » Thu Sep 07, 2017 12:38 pm
Do you see a host in the CCM for 10.10.250.6? Open /usr/local/nagios/var/status.dat, find the SNMP Traps service entry for the host, and post the entry (remove anything you want to keep private). It should look something like:
Code: Select all
servicestatus {
host_name=10.10.250.6
service_description=SNMP Traps
modified_attributes=0
...
}
You can also try refreshing the config under Configure > CCM > Tools > Config File Management. Click Delete Files, Write Configs, Verify Files, and Restart Nagios Core, in that order, to refresh it.
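If status.dat is too large to search comfortably by hand, a minimal Python sketch like the following can locate the entry. It assumes only the block layout shown above (a block type followed by `{`, `key=value` lines, and a closing `}`); the function names are hypothetical.

```python
def parse_status_blocks(text):
    """Yield (block_type, fields) for each 'type { key=value ... }' block."""
    block_type, fields = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if block_type is None:
            if line.endswith("{"):
                block_type, fields = line[:-1].strip(), {}
        elif line == "}":
            yield block_type, fields
            block_type, fields = None, None
        elif "=" in line:
            # Split on the first "=" only; values may contain "=" themselves.
            key, _, value = line.partition("=")
            fields[key] = value


def find_service(text, host, description):
    """Return the servicestatus fields for host/description, or None if absent."""
    for block_type, fields in parse_status_blocks(text):
        if (block_type == "servicestatus"
                and fields.get("host_name") == host
                and fields.get("service_description") == description):
            return fields
    return None


SAMPLE = """
servicestatus {
    host_name=10.10.250.6
    service_description=SNMP Traps
    modified_attributes=0
    }
"""
print(find_service(SAMPLE, "10.10.250.6", "SNMP Traps"))
```

For the real file, read it first (for example `open("/usr/local/nagios/var/status.dat").read()`) and pass the text in place of `SAMPLE`. A result of `None` would mean no service with that exact host name and description exists in the running config.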
by snapon_admin » Thu Sep 07, 2017 3:14 pm
That file doesn't seem to exist.
Code: Select all
[root@lisl-ngos-01-pv var]# ll /usr/local/nagios/var
total 153384
drwxrwxr-x. 2 nagios nagios 73728 Sep 6 23:59 archives
-rw-r--r--. 1 apache apache 50013986 May 11 2015 graphapi.log
-rw-r--r--. 1 nagios nagios 7978 Apr 25 11:47 host-perfdata
-rw-r--r--. 1 nagios nagios 22084 Sep 7 13:35 nagios.configtest
-rw-r--r--. 1 nagios nagios 6 Sep 7 13:35 nagios.lock
-rw-rw-r--. 1 nagios nagios 12840408 Sep 7 15:13 nagios.log
-rw-rw-r--. 1 nagios users 19380236 Jan 15 2015 nagios.tmp1QpCnR
-rw-------. 1 nagios nagios 1323008 Jan 15 2015 nagios.tmpcY0Kxp
-rw-r--r--. 1 nagios nagios 6 Jun 22 12:24 ndo2db.lock
-rw-r--r--. 1 nagios nagios 0 Sep 7 13:34 ndomod.tmp
srwxr-xr-x. 1 nagios nagios 0 Jun 22 12:24 ndo.sock
-rw-r--r--. 1 nagios nagios 9346204 Apr 25 11:48 npcd.log
-rw-r--r--. 1 nagios nagios 10485817 Oct 1 2013 npcd.log.old
-rw-r--r--. 1 nagios nagios 13069207 Apr 25 11:46 objects.cache
-rw-r--r--. 1 nagios nagios 13529934 Sep 7 13:35 objects.precache
-rw-rw-rw-. 1 nagios nagios 5517295 Mar 16 14:00 perfdata.log
-rw-------. 1 nagios nagios 21084063 Sep 7 14:35 retention.dat
drwxrwsr-x. 2 nagios nagcmd 4096 Sep 7 13:35 rw
-rw-r--r--. 1 nagios nagios 230597 Apr 25 11:47 service-perfdata
drwxr-xr-x. 5 root root 4096 Nov 13 2013 spool
drwxr-xr-x. 2 nagios nagios 4096 Sep 7 15:13 stats
scottwilkerson
DevOps Engineer
Posts: 19396 Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
by scottwilkerson » Thu Sep 07, 2017 3:17 pm
snapon_admin wrote: That file doesn't seem to exist.
If you have a RAM disk set up, it may be located in /var/nagiosramdisk/
by snapon_admin » Thu Sep 07, 2017 3:26 pm
Thanks, Scott. Here's the entry for the test host:
Code: Select all
hoststatus {
host_name=10.10.250.6
modified_attributes=0
check_command=check_dummy!0!"No data received yet."
check_period=xi_timeperiod_24x7
notification_period=xi_timeperiod_24x7
check_interval=5.000000
retry_interval=1.000000
event_handler=
has_been_checked=0
should_be_scheduled=0
check_execution_time=0.000
check_latency=0.000
check_type=0
current_state=0
last_hard_state=0
last_event_id=0
current_event_id=0
current_problem_id=0
last_problem_id=0
plugin_output=
long_plugin_output=
performance_data=
last_check=0
next_check=0
check_options=0
current_attempt=1
max_attempts=5
state_type=1
last_state_change=1504809277
last_hard_state_change=1504809277
last_time_up=0
last_time_down=0
last_time_unreachable=0
last_notification=0
next_notification=0
no_more_notifications=0
current_notification_number=0
current_notification_id=0
notifications_enabled=1
problem_has_been_acknowledged=0
acknowledgement_type=0
active_checks_enabled=0
passive_checks_enabled=1
event_handler_enabled=1
flap_detection_enabled=1
process_performance_data=1
obsess=1
last_update=1504815744
is_flapping=0
percent_state_change=0.00
scheduled_downtime_depth=0
_XIWIZARD=0;passiveobject
}
And here's the entry for the trap host that we actually have been monitoring via traps for a while:
Code: Select all
servicestatus {
host_name=lisgrid01p
service_description=SNMP Traps
modified_attributes=0
check_command=restart_snmptt!!!!!!!!
check_period=xi_timeperiod_24x7
notification_period=xi_timeperiod_24x7
check_interval=1.000000
retry_interval=1.000000
event_handler=
has_been_checked=1
should_be_scheduled=0
check_execution_time=15.789
check_latency=0.000
check_type=0
current_state=0
last_hard_state=0
last_event_id=3635958
current_event_id=3635959
current_problem_id=0
last_problem_id=1658571
current_attempt=1
max_attempts=1
state_type=1
last_state_change=1492361259
last_hard_state_change=1492361259
last_time_ok=1504788204
last_time_warning=0
last_time_unknown=0
last_time_critical=1492361261
plugin_output=No traps received, restarting snmptt in case it's stuck. snmptt Service restarted successfully.
long_plugin_output=
performance_data=
last_check=1504788204
next_check=1504788264
check_options=0
current_notification_number=0
current_notification_id=1027212
last_notification=0
next_notification=0
no_more_notifications=0
notifications_enabled=1
active_checks_enabled=0
passive_checks_enabled=1
event_handler_enabled=1
problem_has_been_acknowledged=0
acknowledgement_type=0
flap_detection_enabled=0
process_performance_data=1
obsess=1
last_update=1504815744
is_flapping=0
percent_state_change=0.00
scheduled_downtime_depth=0
_XIWIZARD=0;passiveobject
}
by scottwilkerson » Thu Sep 07, 2017 3:39 pm
You need to create an "SNMP Traps" service for the host 10.10.250.6.
You can do that by running the SNMP Trap Wizard and selecting 10.10.250.6 as the host.
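Once the service exists, one way to check that Nagios accepts passive results for it is to submit a test result by hand. Below is a minimal Python sketch that builds a PROCESS_SERVICE_CHECK_RESULT line in the same field order as the failed command in the nagios.log excerpt earlier in this thread. The command file path is an assumption (the usual default location under the `rw` directory), and the function names are hypothetical; check `command_file` in your nagios.cfg before using it.

```python
import time

# Assumed default location of the external command pipe; verify against the
# command_file setting in nagios.cfg on your own system.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def passive_service_result(host, service, return_code, output, now=None):
    """Format one external command line for a passive service check result."""
    timestamp = int(time.time() if now is None else now)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}\n")


def submit(line, command_file=COMMAND_FILE):
    """Write the command to the Nagios command pipe (needs write permission)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


line = passive_service_result("10.10.250.6", "SNMP Traps", 0, "Manual test result")
print(line, end="")
# submit(line)  # uncomment on the Nagios server to actually send it
```

If the result shows up in the GUI, the service definition is fine and the problem is more likely between snmptt and Nagios. If nagios.log reports "could not be found" again, the host or service name does not match what is configured.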
by snapon_admin » Thu Sep 07, 2017 4:18 pm
I did that. It also doesn't explain why the other 2 devices that we have been monitoring traps on for years are sending traps (the logs show them coming in) without updating Nagios. The new test host I added appears to have received a trap 2 hours ago, but it hasn't updated since. Is there a limit to how many traps Nagios can receive before it just freaks out and can't handle any more? I've noticed that we have a ton of unconfigured hosts sending traps that Nagios isn't configured to monitor. Could this just be a case of too many devices sending traps?
The top host in the attached screenshot is the "test" host I added temporarily just to see if traps would update Nagios at all. The lisgrid01p device gets an error on the device when trying to send a trap, and the keno-etrk-01-pv host sends the trap but Nagios never updates.