Page 1 of 2

Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Tue Jun 10, 2014 4:16 pm
by mlopez
Hi All,
I've noticed since running 2012R2.9 a discrepancy in SNMP Traps. I currently run another nms which forwards all traps to SNMPTT and then SNMPTT uses "EXEC /usr/local/bin/snmptraphandling.py" to forward the traps to NagiosXI. I thought maybe the upgrade to 2014R1.1 would have resolved the issue but it hasn't.

Here is the scenario.
Every (1 hour) I receive a trap advising X service from all my systems which advises (all fine) on the other product, but when auditing SNMPTT logs (/var/log/snmptt/snmptt.log) I'm not seeing this trap every hour but at random intervals.


Example:
-I'll see the Trap at 5:00AM on Onms and at 5:10AM on SNMPTT (This is fine as I'm forwarding traps so there will be some delay)
-I'll see a Trap at 6:00AM on Onms and nothing on SNMPTT
- I'll see a Trap at 7:00AM on Onms and nothing on SNMPTT
- I'll see a Trap at 8:00AM on Onms and I'll see it at 8:15 on SNMPTT (This is fine as I'm forwarding traps so there will be some delay)

What I'm wondering is what happened to the 6 and 7AM Traps, btw the other nms is on the same Local Network and I have not Wirresharked to troubleshoot but just wondering if there is some sort of "caching" which is done by default on snmptrapd or snmptt which could cause this issue.


Now the big issue here is if I'm missing other traps and thus this is a very important variable.

-------------------

Version of SNMPTT:

Code: Select all

[root@NagiosXI processed_mibs]# rpm -aq |grep snmptt
snmptt-1.4-0.9.beta2.el6.noarch

I am wondering if SNMPTT has some sort of "caching" where is matches same alerts from same device together but as per the /etc/snmp/snmptt.ini configuration I do not have that enabled as shown below:

Code: Select all

# How often duplicate traps will be processed.  An MD5 hash of all incoming traps
# is stored in memory and is used to check for duplicates.  All variables except for
# the uptime variable are used when calculating the MD5.  The larger this variable,
# the more memory snmptt will require.
# Note:  In most cases it may be a good idea to enable this but sometimes it can have a
#        negative effect.  For example, if you are trying to troubleshoot a wireless device
#        that keeps losing it's connection you may want to disable this so that you see
#        all the associations and disassociations.
# 5 minutes = 300
# 10 minutes = 600
# 15 minutes = 900
duplicate_trap_window = 0

Any help would be greatly appreciated.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Tue Jun 10, 2014 4:35 pm
by sreinhardt
Would the 6 and 7 am traps, possibly have a different OID, maybe due to one or more devices not being OK? Have you taken a look at the snmpttunknown.log? Also just a reminder snmp is a UDP protocol, it is entirely possible that the trap never made it along the way, and there would be no retry or notification to either device.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Tue Jun 10, 2014 4:50 pm
by mlopez
Hi Spencer,
This specific SNMP Trap which will call (Heartbeat) is always the same OID for all the instances and it's not in snmpttunknown.log as I have other traps in there which I have yet to map which means the snmpttunknown.log is being populated.

As you mentioned SNMP traps can be TCP or UDP in my case UDP as I like my network unsaturated;) and yes completely possible as UDP doesn't wait for an acknowledgement but what I find strange is that it's in the local network NMS -> "NAGIOS SNMPTRAPD -> SNMPTT" and again it's completely possible there are underlying local network packet loss which again I have yet to troubleshoot thoroughly before judging SNMPTRAPD and SNMPTT.

I quickly checked the network but before doing hours of network troubleshooting with my fluke, Is there any caching like mentioned in my first post which I could have missed?

Here are my SNMPTRAPD and SNMPTT settings (btw I used NagiosXI-SNMPTrap-setup.sh to install SNMPTRAPD + SNMPTT + SETTINGS)

And thanks again Spencer for all your help

SNMPTRAPD:

Code: Select all

[root@NagiosXI snmp]# cat /etc/snmp/snmptrapd.conf
disableAuthorization yes
traphandle default /usr/sbin/snmptthandler
SNMPTT.ini:

Code: Select all

#
# SNMPTT v1.4beta2 Configuration File
#
# Linux / Unix
#

[General]
# Name of this system for $H variable.  If blank, system name will be the computer's
# hostname via Sys::Hostname.
snmptt_system_name =

# Set to either 'standalone' or 'daemon'
# standalone: snmptt called from snmptrapd.conf
# daemon: snmptrapd.conf calls snmptthandler
# Ignored by Windows.  See documentation
mode = daemon

# Set to 1 to allow multiple trap definitions to be executed for the same trap.
# Set to 0 to have it stop after the first match.
# This option should normally be set to 1.  See the section 'SNMPTT.CONF Configuration
# file Notes' in the SNMPTT documentation for more information.
# Note: Wildcard matches are only matched if there are NO exact matches.  This takes
#       into consideration the NODES list.  Therefore, if there is a matching trap, but
#       the NODES list prevents it from being considered a match, the wildcard entry will
#       only be used if there are no other exact matches.
multiple_event = 1

# SNMPTRAPD passes the IP address of device sending the trap, and the IP address of the
# actual SNMP agent.  These addresses could differ if the trap was sent on behalf of another
# device (relay, proxy etc).
# If DNS is enabled, the agent IP address is converted to a host name using a DNS lookup
# (which includes the local hosts file, depending on how the OS is configured).  This name
# will be used for: NODES entry matches, hostname field in logged traps (file / database),
# and the $A variable.  Host names on the NODES line will be resolved and the IP address
# will then be used for comparing.
# Set to 0 to disable DNS resolution
# Set to 1 to enable DNS resolution
dns_enable = 0

# Set to 0 to enable the use of FQDN (Fully Qualified Domain Names).  If a host name is
# passed to SNMPTT that contains a domain name, it will not be altered in any way by
# SNMPTT.  This also affects resolve_value_ip_addresses.
# Set to 1 to have SNMPTT strip the domain name from the host name passed to it.  For
# example, server01.domain.com would be changed to server01
# Set to 2 to have SNMPTT strip the domain name from the host name passed to it
# based on the list of domains in strip_domain_list
strip_domain = 1

# List of domain names that should be stripped when strip_domain is set to 2.
# List can contain one or more domains.  For example, if the FQDN of a host is
# server01.city.domain.com and the list contains domain.com, the 'host' will be
# set as server01.city.
strip_domain_list = <<END
domain.com
END

# Configures how IP addresses contained in the VALUE of the variable bindings are handled.
# This only applies to the values for $n, $+n, $-n, $vn, $+*, $-*.
# Set to 0 to disable resolving ip address to host names
# Set to 1 to enable resolving ip address to host names
# Note: net_snmp_perl_enable *must* be enabled.  The strip_domain settings influence the
# format of the resolved host name.  DNS must be enabled (dns_enable)
resolve_value_ip_addresses = 0

# Set to 1 to enable the use of the Perl module from the UCD-SNMP / NET-SNMP package.
# This is required for $v variable substitution to work, and also for some other options
# that are enabled in this .ini file.
# Set to 0 to disable the use of the Perl module from the UCD-SNMP / NET-SNMP package.
# Note: Enabling this with stand-alone mode can cause SNMPTT to run very slowly due to
#       the loading of the MIBS at startup.
net_snmp_perl_enable = 1

# Set to 1 to enable caching of OID and ENUM translations when net_snmp_perl_enable is
# enabled.  Enabling this should result in faster translations.
# Set to 0 to disable caching.
# Note: Restart SNMPTT after updating the MIB files for Net-SNMP, otherwise the cache may
# contain inaccurate data.  Defaults to 1.
net_snmp_perl_cache_enable = 1

# This sets the best_guess parameter used by the UCD-SNMP / NET-SNMP Perl module for
# translating symbolic nams to OIDs and vice versa.
# For UCD-SNMP, and Net-SNMP 5.0.8 and previous versions, set this value to 0.
# For Net-SNMP 5.0.9, or any Net-SNMP with patch 722075 applied, set this value to 2.
# A value of 2 is equivalent to -IR on Net-SNMP command line utilities.
# UCD-SNMP and Net-SNMP 5.0.8 and previous may not be able to translate certain formats of
# symbolic names such as RFC1213-MIB::sysDescr.  Net-SNMP 5.0.9 or patch 722075 will allow
# all possibilities to be translated.  See the FAQ section in the README for more info
net_snmp_perl_best_guess = 0

# Configures how the OID of the received trap is handled when outputting to a log file /
# database.  It does NOT apply to the $O variable.
# Set to 0 to use the default of numerical OID
# Set to 1 to translate the trap OID to short text (symbolic form) (eg: linkUp)
# Set to 2 to translate the trap OID to short text with module name (eg: IF-MIB::linkUp)
# Set to 3 to translate the trap OID to long text (eg: iso...snmpTraps.linkUp)
# Set to 4 to translate the trap OID to long text with module name (eg:
# IF-MIB::iso...snmpTraps.linkUp)
# Note: -The output of the long format will vary depending on the version of Net-SNMP you
#        are using.
#       -net_snmp_perl_enable *must* be enabled
#       -If using database logging, ensure the trapoid column is large enough to hold the
#        entire line
translate_log_trap_oid = 0

# Configures how OIDs contained in the VALUE of the variable bindings are handled.
# This only applies to the values for $n, $+n, $-n, $vn, $+*, $-*.  For substitutions
# that include variable NAMES ($+n etc), only the variable VALUE is affected.
# Set to 0 to disable translating OID values to text (symbolic form)
# Set to 1 to translate OID values to short text (symbolic form) (eg: BuildingAlarm)
# Set to 2 to translate OID values to short text with module name (eg: UPS-MIB::BuildingAlarm)
# Set to 3 to translate OID values to long text (eg: iso...upsAlarm.BuildingAlarm)
# Set to 4 to translate OID values to long text with module name (eg:
# UPS-MIB::iso...upsAlarm.BuildingAlarm)
# For example, if the value contained: 'A UPS Alarm (.1.3.6.1.4.1.534.1.7.12) has cleared.',
# it could be translated to: 'A UPS Alarm (UPS-MIB::BuildingAlarm) has cleared.'
# Note: net_snmp_perl_enable *must* be enabled
translate_value_oids = 1

# Configures how the symbolic enterprise OID will be displayed for $E.
# Set to 1, 2, 3 or 4.  See translate_value_oids options 1,2,3 and 4.
# Note: net_snmp_perl_enable *must* be enabled
translate_enterprise_oid_format = 1

# Configures how the symbolic trap OID will be displayed for $O.
# Set to 1, 2, 3 or 4.  See translate_value_oids options 1,2,3 and 4.
# Note: net_snmp_perl_enable *must* be enabled
translate_trap_oid_format = 1

# Configures how the symbolic trap OID will be displayed for $v, $-n, $+n, $-* and $+*.
# Set to 1, 2, 3 or 4.  See translate_value_oids options 1,2,3 and 4.
# Note: net_snmp_perl_enable *must* be enabled
translate_varname_oid_format = 1

# Set to 0 to disable converting INTEGER values to enumeration tags as defined in the
# MIB files
# Set to 1 to enable converting INTEGER values to enumeration tags as defined in the
# MIB files
# Example: moverDoorState:open instead of moverDoorState:2
# Note: net_snmp_perl_enable *must* be enabled
translate_integers = 1

# Allows you to set the MIBS environment variable used by SNMPTT
# Leave blank or comment out to have the systems enviroment settings used
# To have all MIBS processed, set to ALL
# See the snmp.conf manual page for more info
#mibs_environment = ALL

# Set what is used to separate variables when wildcards are expanded on the FORMAT /
# EXEC line.  Defaults to a space.  Value MUST be within quotes.  Can contain 1 or
# more characters
wildcard_expansion_separator = " "

# Set to 1 to allow unsafe REGEX code to be executed.
# Set to 0 to prevent unsafe REGEX code from being executed (default).
# Enabling unsafe REGEX code will allow variable interopolation and the use of the e
# modifier to allow statements such as substitution with captures such
# as:            (one (two) three)(five $1 six)
# which outputs: five two six
# or:            (one (two) three)("five ".length($1)." six")e
# which outputs: five 3 six
#
# This is considered unsafe because the contents of the regular expression
# (right) is executed (eval) by Perl which *could contain unsafe code*.
# BE SURE THAT THE SNMPTT CONFIGURATION FILES ARE SECURE!
allow_unsafe_regex = 0

# Set to 1 to have the backslash (escape) removed from quotes passed from
# snmptrapd.  For example, \" would be changed to just "
# Set to 0 to disable
remove_backslash_from_quotes = 0

# Set to 1 to have NODES files loaded each time a trap is processed.
# Set to 0 to have all NODES files loaded when the snmptt.conf files are loaded.
# If NODES files are used (files that contain lists of NODES), then setting to 1
# will cause the list to be loaded each time an EVENT is processed that uses
# NODES files.  This will allow the NODES file to be modified while SNMPTT is
# running but can result in many file reads depending on the number of traps
# received.  Defaults to 0
dynamic_nodes = 0

# This option allows you to use the $D substitution variable to include the
# description text from the SNMPTT.CONF or MIB files.
# Set to 0 to disable the $D substitution variable.  If $D is used, nothing
#  will be outputted.
# Set to 1 to enable the $D substitution variable and have it use the
#  descriptions stored in the SNMPTT .conf files.  Enabling this option can
#  greatly increase the amount of memory used by SNMPTT.
# Set to 2 to enable the $D substitution variable and have it use the
#  description from the MIB files.  This enables the UCD-SNMP / NET-SNMP Perl
#  module save_descriptions variable.  Enabling this option can greatly
#  increase the amount of memory used by the Net-SNMP SNMP Perl module, which
#  will result in an increase of memory usage by SNMPTT.
description_mode = 1

# Set to 1 to remove any white space at the start of each line from the MIB
# or SNMPTT.CONF description when description_mode is set to 1 or 2.
description_clean = 1

# Warning: Experimental.  Not recommended for production environments.
#          When threads are enabled, SNMPTT may quit unexpectedly.
# Set to 1 to enable threads (ithreads) in Perl 5.6.0 or higher.  If enabled,
# EXEC will launch in a thread to allow SNMPTT to continue processing other
# traps.  See also threads_max.
# Set to 0 to disable threads (ithreads).
# Defaults to 0
threads_enable = 0

# Warning: Experimental.  Not recommended for production environments.
#          When threads are enabled, SNMPTT may quit unexpectedly.
# This option allows you to set the maximum number of threads that will
# execute at once.  Defaults to 10
threads_max = 10

# The date format for $x in strftime() format.  If not defined, defaults
# to %a %b %e %Y.
#date_format = %a %b %e %Y

# The time format for $X in strftime() format.  If not defined, defaults
# to %H:%M:%S.
#time_format = %H:%M:%S

# The date time format in strftime() format for the date/time when logging
# to standard output, snmptt log files (log_file) and the unknown log file
# (unknown_trap_log_file).  Defaults to localtime().  For SQL, see
# date_time_format_sql.
# Example:  %a %b %e %Y %H:%M:%S
#date_time_format =

[DaemonMode]
# Set to 1 to have snmptt fork to the background when run in daemon mode
# Ignored by Windows.  See documentation
daemon_fork = 1

# Set to the numerical user id (eg: 500) or textual user id (eg: snmptt)
# that snmptt should change to when running in daemon mode.  Leave blank
# to disable.  The user used should have read/write access to all log
# files, the spool folder, and read access to the configuration files.
# Only use this if you are starting snmptt as root.
# A second (child) process will be started as the daemon_uid user so
# there will be two snmptt processes running.  The first process will
# continue to run as the user that ran snmptt (root), waiting for the
# child to quit.  After the child quits, the parent process will remove
# the snmptt.pid file and exit.
daemon_uid = snmptt

# Complete path of file to store process ID when running in daemon mode.
pid_file = /var/run/snmptt.pid

# Directory to read received traps from.  Ex: /var/spool/snmptt/
# Don't forget the trailing slash!
spool_directory = /var/spool/snmptt/

# Amount of time in seconds to sleep between processing spool files
sleep = 5

# Set to 1 to have SNMPTT use the time that the trap was processed by SNMPTTHANDLER
# Set to 0 to have SNMPTT use the time the trap was processed.  Note:  Using 0 can
# result in the time being off by the number of seconds used for 'sleep'
use_trap_time = 1

# Set to 0 to have SNMPTT erase the spooled trap file after it attempts to process
# the trap even if it did not successfully log the trap to any of the log systems.
# Set to 1 to have SNMPTT erase the spooled trap file only after it successfully
# logs to at least ONE log system.
# Set to 2 to have SNMPTT erase the spooled trap file only after it successfully
# logs to ALL of the enabled log systems.  Warning:  If multiple log systems are
# enabled and only one fails, the other log system will continuously be logged to
# until ALL of the log systems function.
# The recommended setting is 1 with only one log system enabled.
keep_unlogged_traps = 1

# How often duplicate traps will be processed.  An MD5 hash of all incoming traps
# is stored in memory and is used to check for duplicates.  All variables except for
# the uptime variable are used when calculating the MD5.  The larger this variable,
# the more memory snmptt will require.
# Note:  In most cases it may be a good idea to enable this but sometimes it can have a
#        negative effect.  For example, if you are trying to troubleshoot a wireless device
#        that keeps losing it's connection you may want to disable this so that you see
#        all the associations and disassociations.
# 5 minutes = 300
# 10 minutes = 600
# 15 minutes = 900
duplicate_trap_window = 0

[Logging]
# Set to 1 to enable messages to be sent to standard output, or 0 to disable.
# Would normally be disabled unless you are piping this program to another
stdout_enable = 0

# Set to 1 to enable text logging of *TRAPS*.  Make sure you specify a log_file
# location
log_enable = 1

# Log file location.  The COMPLETE path and filename.  Ex: '/var/log/snmptt/snmptt.log'
log_file = /var/log/snmptt/snmptt.log

# Set to 1 to enable text logging of *SNMPTT system errors*.  Make sure you
# specify a log_system_file location
log_system_enable = 1

# Log file location.  The COMPLETE path and filename.
# Ex: '/var/log/snmptt/snmpttsystem.log'
log_system_file = /var/log/snmptt/snmpttsystem.log

# Set to 1 to enable logging of unknown traps.  This should normally be left off
# as the file could grow large quickly.  Used primarily for troubleshooting.  If
# you have defined a trap in snmptt.conf, but it is not executing, enable this to
# see if it is being considered an unknown trap due to an incorrect entry or
# simply missing from the snmptt.conf file.
# Unknown traps can be logged either a text file, a SQL table or both.
# See SQL section to define a SQL table to log unknown traps to.
unknown_trap_log_enable = 1

# Unknown trap log file location.  The COMPLETE path and filename.
# Ex: '/var/log/snmptt/snmpttunknown.log'
# Leave blank to disable logging to text file if logging to SQL is enabled
# for unknown traps
unknown_trap_log_file = /var/log/snmptt/snmpttunknown.log

# How often in seconds statistics should be logged to syslog or the event log.
# Set to 0 to disable
# 1 hour = 216000
# 12 hours = 2592000
# 24 hours = 5184000
statistics_interval = 0

# Set to 1 to enable logging of *TRAPS* to syslog.  If you do not have the Sys::Syslog
# module then disable this.  Windows users should disable this.
syslog_enable = 1

# Syslog facility to use for logging of *TRAPS*.  For example: 'local0'
syslog_facility = local0

# Set the syslog level for *TRAPS* based on the severity level of the trap
# as defined in the snmptt.conf file.  Values must be one per line between
# the syslog_level_* and END lines, and are not case sensitive.  For example:
#   Warning
#   Critical
# Duplicate definitions will use the definition with the higher severity.
syslog_level_debug = <<END
END
syslog_level_info = <<END
END
syslog_level_notice = <<END
END
syslog_level_warning = <<END
END
syslog_level_err = <<END
END
syslog_level_crit = <<END
END
syslog_level_alert = <<END
END

# Syslog default level to use for logging of *TRAPS*.  For example: warning
# Valid values: emerg, alert, crit, err, warning, notice, info, debug
syslog_level = info

# Set to 1 to enable logging of *SNMPTT system errors* to syslog.  If you do not have the
# Sys::Syslog module then disable this.  Windows users should disable this.
syslog_system_enable = 1

# Syslog facility to use for logging of *SNMPTT system errors*.  For example: 'local0'
syslog_system_facility = local0

# Syslog level to use for logging of *SNMPTT system errors*..  For example: 'warning'
# Valid values: emerg, alert, crit, err, warning, notice, info, debug
syslog_system_level = warning

[SQL]
# Determines if the enterprise column contains the numeric OID or symbolic OID
# Set to 0 for numeric OID
# Set to 1 for symbolic OID
# Uses translate_enterprise_oid_format to determine format
# Note: net_snmp_perl_enable *must* be enabled
db_translate_enterprise = 0

# FORMAT line to use for unknown traps.  If not defined, defaults to $-*.
db_unknown_trap_format = '$-*'

# List of custom SQL column names and values for the table of received traps
# (defined by *_table below).  The format is
#   column name
#   value
#
# For example:
#
#   binding_count
#   $#
#   uptime2
#   The agent has been up for $T.
sql_custom_columns = <<END
END

# List of custom SQL column names and values for the table of unknown traps
# (defined by *_table_unknown below).  See sql_custom_columns for the format.
sql_custom_columns_unknown = <<END
END

# MySQL: Set to 1 to enable logging to a MySQL database via DBI (Linux / Windows)
# This requires DBI:: and DBD::mysql
mysql_dbi_enable = 1

# MySQL: Hostname of database server (optional - default localhost)
mysql_dbi_host = localhost

# MySQL: Port number of database server (optional - default 3306)
mysql_dbi_port = 3306
# MySQL: How often in seconds the database should be 'pinged' to ensure the
# connection is still valid.  If *any* error is generate by the ping such as
# 'Unable to connect to database', it will attempt to re-create the database
# connection.  Set to 0 to disable pinging.
# Note:  This has no effect on mysql_ping_on_insert.
# disabled = 0
# 5 minutes = 300
# 15 minutes = 900
# 30 minutes = 1800
mysql_ping_interval = 300

# PostgreSQL: Set to 1 to enable logging to a PostgreSQL database via DBI (Linux / Windows)
# This requires DBI:: and DBD::PgPP
postgresql_dbi_enable = 0

# Set to 0 to use the DBD::PgPP module
# Set to 1 to use the DBD::Pg module
postgresql_dbi_module = 0

# Set to 0 to disable host and port network support
# Set to 1 to enable host and port network support
# If set to 1, ensure PostgreSQL is configured to allow connections via TCPIP by setting
# tcpip_socket = true in the $PGDATA/postgresql.conf file, and adding the ip address of
# the SNMPTT server to $PGDATApg_hba.conf.  The common location for the config files for
# RPM installations of PostgreSQL is /var/lib/pgsql/data.
postgresql_dbi_hostport_enable = 0

# PostgreSQL: Hostname of database server (optional - default localhost)
postgresql_dbi_host = localhost

# PostgreSQL: Port number of database server (optional - default 5432)
postgresql_dbi_port = 5432

# PostgreSQL: Database to use
postgresql_dbi_database = snmptt

# PostgreSQL: Table to use for unknown traps
# Leave blank to disable logging of unknown traps to PostgreSQL
# Note: unknown_trap_log_enable must be enabled.
postgresql_dbi_table_unknown = snmptt_unknown

# PostgreSQL: Table to use for statistics
# Note: statistics_interval must be set.  See also stat_time_format_sql.
#postgresql_dbi_table_statistics = snmptt_statistics
postgresql_dbi_table_statistics =

# PostgreSQL: Table to use
postgresql_dbi_table = snmptt

# PostgreSQL: Username to use
postgresql_dbi_username = snmpttuser

# PostgreSQL: Password to use
postgresql_dbi_password = password

# PostgreSQL: Whether or not to 'ping' the database before attempting an INSERT
# to ensure the connection is still valid.  If *any* error is generate by
# the ping such as 'Unable to connect to database', it will attempt to
# re-create the database connection.
# Set to 0 to disable
# Set to 1 to enable
# Note:  This has no effect on postgresqll_ping_interval.
postgresql_ping_on_insert = 1

# PostgreSQL: How often in seconds the database should be 'pinged' to ensure the
# connection is still valid.  If *any* error is generate by the ping such as
# 'Unable to connect to database', it will attempt to re-create the database
# connection.  Set to 0 to disable pinging.
# Note:  This has no effect on postgresql_ping_on_insert.
# disabled = 0
# 5 minutes = 300
# 15 minutes = 900
# 30 minutes = 1800
postgresql_ping_interval = 300

# ODBC: Set to 1 to enable logging to a database via ODBC using DBD::ODBC.
# This requires both DBI:: and DBD::ODBC
dbd_odbc_enable = 0

# DBD:ODBC: Database to use
dbd_odbc_dsn = snmptt

# DBD:ODBC: Table to use
dbd_odbc_table = snmptt

# DBD:ODBC: Table to use for unknown traps
# Leave blank to disable logging of unknown traps to DBD:ODBC
# Note: unknown_trap_log_enable must be enabled.
dbd_odbc_table_unknown = snmptt_unknown

# DBD:ODBC: Table to use for statistics
# Note: statistics_interval must be set.  See also stat_time_format_sql.
#dbd_odbc_table_statistics = snmptt_statistics
dbd_odbc_table_statistics =

# DBD:ODBC: Username to use
dbd_odbc_username = snmptt

# DBD:DBC:: Password to use
dbd_odbc_password = password


# DBD:ODBC: Whether or not to 'ping' the database before attempting an INSERT
# to ensure the connection is still valid.  If *any* error is generate by
# the ping such as 'Unable to connect to database', it will attempt to
# re-create the database connection.
# Set to 0 to disable
# Set to 1 to enable
# Note:  This has no effect on dbd_odbc_ping_interval.
dbd_odbc_ping_on_insert = 1

# DBD:ODBC:: How often in seconds the database should be 'pinged' to ensure the
# connection is still valid.  If *any* error is generate by the ping such as
# 'Unable to connect to database', it will attempt to re-create the database
# connection.  Set to 0 to disable pinging.
# Note:  This has no effect on dbd_odbc_ping_on_insert.
# disabled = 0
# 5 minutes = 300
# 15 minutes = 900
# 30 minutes = 1800
dbd_odbc_ping_interval = 300

# The date time format for the traptime column in SQL.  Defaults to
# localtime().  When a date/time field is used in SQL, this should
# be changed to follow a standard that is supported by the SQL server.
# Example:  For a MySQL DATETIME, use %Y-%m-%d %H:%M:%S.
#date_time_format_sql =

# The date time format for the stat_time column in SQL.  Defaults to
# localtime().  When a date/time field is used in SQL, this should
# be changed to follow a standard that is supported by the SQL server.
# Example:  For a MySQL DATETIME, use %Y-%m-%d %H:%M:%S.
#stat_time_format_sql =

[Exec]

# Set to 1 to allow EXEC statements to execute.  Should normally be left on unless you
# want to temporarily disable all EXEC commands
exec_enable = 1

# Set to 1 to allow PREEXEC statements to execute.  Should normally be left on unless you
# want to temporarily disable all PREEXEC commands
pre_exec_enable = 1

# If defined, the following command will be executed for ALL unknown traps.  Passed to the
# command will be all standard and enterprise variables, similar to unknown_trap_log_file
# but without the newlines.
unknown_trap_exec =

# FORMAT line that is passed to the unknown_trap_exec command.  If not defined, it
# defaults to what is described in the unknown_trap_exec setting.  The following
# would be *similar* to the default described in the unknown_trap_exec setting
# (all on one line):
# $x !! $X: Unknown trap ($o) received from $A at: Value 0: $A Value 1: $aR
# Value 2: $T Value 3: $o Value 4: $aA Value 5: $C Value 6: $e Ent Values: $+*
unknown_trap_exec_format =

# Set to 1 to escape wildards (* and ?) in EXEC, PREEXEC and the unknown_trap_exec
# commands.  Enable this to prevent the shell from expanding the wildcard
# characters.  The default is 1.
exec_escape = 1

[Debugging]
# 0 - do not output messages
# 1 - output some basic messages
# 2 - out all messages
DEBUGGING = 0

# Debugging file - SNMPTT
# Location of debugging output file.  Leave blank to default to STDOUT (good for
# standalone mode, or daemon mode without forking)
DEBUGGING_FILE =
# DEBUGGING_FILE = /var/log/snmptt/snmptt.debug

# Debugging file - SNMPTTHANDLER
# Location of debugging output file.  Leave blank to default to STDOUT
DEBUGGING_FILE_HANDLER =
# DEBUGGING_FILE_HANDLER = /var/log/snmptt/snmptthandler.debug

[TrapFiles]
# A list of snmptt.conf files (this is NOT the snmptrapd.conf file).  The COMPLETE path
# and filename.  Ex: '/etc/snmp/snmptt.conf'
snmptt_conf_files = <<END
/usr/share/snmp/mibs/processed_mibs/notification-mib.txt
/etc/snmp/snmptt.conf
END

# MySQL: Database to use
mysql_dbi_database = snmptt

# MySQL: Table to use
mysql_dbi_table = snmptt

# MySQL: Table to use for unknown traps
# Leave blank to disable logging of unknown traps to MySQL
# Note: unknown_trap_log_enable must be enabled.
mysql_dbi_table_unknown = snmptt_unknown

# MySQL: Table to use for statistics
# Note: statistics_interval must be set.  See also stat_time_format_sql.
#mysql_dbi_table_statistics = snmptt_statistics
mysql_dbi_table_statistics =

# MySQL: Username to use
mysql_dbi_username = snmptt

# MySQL: Password to use
mysql_dbi_password = snmpttpass

# MySQL: Whether or not to 'ping' the database before attempting an INSERT
# to ensure the connection is still valid.  If *any* error is generate by
# the ping such as 'Unable to connect to database', it will attempt to
# re-create the database connection.
# Set to 0 to disable
# Set to 1 to enable
# Note:  This has no effect on mysql_ping_interval.
mysql_ping_on_insert = 1


Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Tue Jun 10, 2014 5:02 pm
by sreinhardt
Well your config looks pretty much perfect there. As for caching, not really although there is spooling when snmptrapd pulls traps off the network and snmptt unspools them. This would be in /var/spool/snmptt. I suppose you could look into that directory, otherwise I will have to do some thinking about what the best way to identify the failure.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Tue Jun 10, 2014 5:35 pm
by mlopez
Hi Spenser,
Thanks again for your invaluable output, btw /var/spool/snmptt is empty most of the time.

Code: Select all

[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]# ls
[root@NagiosXI snmptt]#

Ran watch ls ;)

Code: Select all

[root@NagiosXI snmptt]# ls
#snmptt-trap-14024xxxxxxxxxxx  #snmptt-trap-14024392xxxxxxxxxx  #snmptt-trap-14024392466xxx  #snmptt-trap-14024392468xxxx
[root@NagiosXI snmptt]#
I ran a "watch cat *" in the /var/spool/snmptt directory and yes the RAW SNMP traps are appearing but disappear once they arrive as they are mapped and SNMPTT is probably executing them.

Spenser shoot any ideas even if they are crazy as this is something I've been looking at for months and you are probably right about being a network issue but wanted to make that was my last resort. The reason I'm thinking caching/bundling is the other day I fixed an issue which was bugging me for a while by rming "retention.dat" as it caching all the information and my changes were not taking effect.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Wed Jun 11, 2014 12:29 pm
by mlopez
Here are more settings:

/etc/sysconfig

Code: Select all

[root@NagiosXI sysconfig]# cat snmptrapd
# snmptrapd command line options
# OPTIONS="-Lsd -p /var/run/snmptrapd.pid"
OPTIONS="-On -Lnsd -p /var/run/snmptrapd.pid"
[root@NagiosXI sysconfig]#
The reason I added (-n) is I didn't want it to resolve ip's.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Wed Jun 11, 2014 9:38 pm
by sreinhardt
I was just going to say that I wasn't sure what the additional -n was doing there, thanks for clearing that up. As for testing, I have a few thoughts, some that should clearly be tried before others.

1) since this is an hourly thing, or fairly so, you should be able to look at either the snmptt.log or enable debugging(my preference) and grep through the log file for when those oid's come in. I would suggest OID over hostname\IP or another fairly unique id, as we know that the device has to send the same OID to get picked up. If we can determine snmptt is seeing them and what it is doing, we should be able to trace it from there.

2) if that shows the traps are missing from snmptt when XI is not showing them as well, I would look, towards tcpdumping at both systems on a cron for 10 minutes on either side of when that trap should be run. Then, again we can try to correlate what the two systems are seeing\sending and where the failure might be. This one could get ugly, so the first options would definitely take preference.

I do have some other thoughts floating around, but I'd say we stick with these two first.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Fri Jun 13, 2014 3:50 am
by mlopez
Hi Spenser,
Thanks for the advise I'm going to try this today thoroughly.

Step 1 - run the following snmptrapd -f -Lo -c /etc/snmp/snmptrapd.conf >> snmptrapd.conf.log
Step 2 - I'm going to try wireshark on both systems and correlate alerts



-----------

BTW is this normal?

Code: Select all

root      6753  0.0  0.2 195136 13760 ?        Ss   04:39   0:00 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt    6754  0.1  0.2 197372 15756 ?        Ss   04:39   0:00 /usr/bin/perl /usr/sbin/snmptt --daemon
It seems 2 instances of snmptt running one as root and one as snmptt and whenever I restart it it respawns both.


And btw here is my servicetemplate for heartbeat


This is the ONLY "SERVICE" which has a freshness_threshold and the reason I did that is I wanted it to run every 4 hours and let me know if it didn't receive any traps with the message "No Traps Detected" with the "2" = CRITICAL Alert. If there is anything wrong let me know:

Code: Select all

define service {
       name                                     heartbeatNotify_service
       host_name                                *
       hostgroup_name                           host
       service_description                      heartbeatNotify
       display_name                             heartbeatNotify
       servicegroups                            heartbeatNotify_servicegroup
       use                                      generic-service
       check_command                            check_dummy!2 "No Traps Detected"!!!!!!!
       is_volatile                              1
       initial_state                            o
       max_check_attempts                       1
       check_interval                           1
       retry_interval                           1
       active_checks_enabled                    0
       passive_checks_enabled                   1
       check_period                             xi_timeperiod_24x7
       check_freshness                          1
       freshness_threshold                      14400
       flap_detection_enabled                   0
       notification_interval                    1
       notification_period                      xi_timeperiod_24x7
       notification_options                     w,c,u,r,
       notifications_enabled                    1
       contacts                                 mike,nagiosadmin
       contact_groups                           admins
       icon_image                               snmptrap.png
       register                                 0

}

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Fri Jun 13, 2014 12:15 pm
by sreinhardt
Yup perfectly normal for dual processes, one starts from the init system and the second is what actually runs as the snmptt user. Off hand I do not see anything glaringly obvious that might be wrong with the template. Let me know how wireshark goes, I am curious to see what is happening.

Re: Nagios XI SNMP Traps (SNMPTT)(Missing Traps)

Posted: Fri Jun 13, 2014 2:32 pm
by mlopez
Hi Spenser,
I just found 1 issue where the traps are getting to SNMPTT but not all are forwarding to NagiosXI

Look at this:
After further analysis i do in fact have 2 issues.


Easy one first :)

Issue number 1 is (SNMPTT -> NAGIOS) all traps do not seem to be forwarding and again I will use "heartbeat" as this is the only trap that constantly coming in every 1 ~ 2 hours depending of system in this case every 2 hours.


Compare timestamp SNMPTT.LOG to my Nagios GUI:

SNMPTT LOG:

/var/log/snmptt/snmptt.log:

Code: Select all

Thu Jun 12 04:09:14 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/12/2014 12:05:15 PM GMT +04:00
Thu Jun 12 06:05:21 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/12/2014 2:05:15 PM GMT +04:00
Thu Jun 12 08:07:39 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/12/2014 4:05:15 PM GMT +04:00
Thu Jun 12 10:06:53 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/12/2014 6:05:15 PM GMT +04:00
Thu Jun 12 12:07:45 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/12/2014 8:05:15 PM GMT +04:00
Thu Jun 12 16:09:35 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/13/2014 12:05:15 AM GMT +04:00
Thu Jun 12 16:09:36 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/13/2014 12:05:15 AM GMT +04:00
Thu Jun 12 18:04:46 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = OFF 6/13/2014 2:05:15 AM GMT +04:00
Thu Jun 12 22:08:27 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = OFF 6/13/2014 6:05:15 AM GMT +04:00
Fri Jun 13 01:59:39 2014 .1.3.6.1.4.1.x.100.3.1 Normal "Status Events" 1.1.1.1 - TMB61 -  heartbeat System status = ON 6/13/2014 9:57:02 AM GMT +04:00


Nagios Gui Notification screen: (CRITICAL: No Traps Detected That is generated by NagiosXI when it didn't detect a Trap)

Code: Select all

2014-06-13 01:59:40	TMB61	heartbeatNotify	OK	HARD	1 of 1	TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 ():enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.
2014-06-13 01:08:53	TMB61	heartbeatNotify	CRITICAL	HARD	1 of 1	CRITICAL: No Traps Detected
2014-06-12 22:08:28	TMB61	heartbeatNotify	OK	HARD	1 of 1	TMB61 System status OFF / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 ():enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises
2014-06-12 21:04:52	TMB61	heartbeatNotify	CRITICAL	HARD	1 of 1	CRITICAL: No Traps Detected
2014-06-12 16:09:37	TMB61	heartbeatNotify	OK	HARD	1 of 1	TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 ():enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.
2014-06-12 15:08:08	TMB61	heartbeatNotify	CRITICAL	HARD	1 of 1	CRITICAL: No Traps Detected
2014-06-12 04:09:14	TMB61	heartbeatNotify	OK	HARD	1 of 1	TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 ():enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.
2014-06-12 03:33:31	TMB61	heartbeatNotify	CRITICAL	HARD	1 of 1	CRITICAL: No Traps Detected

/usr/local/nagios/var/nagios.log

Code: Select all

[1402545600] CURRENT SERVICE STATE: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status OFF / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/12/2014 5:28:22 AM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():OFF
[1402547551] Warning: The results of service 'heartbeatNotify' on host 'TMB61' are stale by 0d 0h 0m 49s (threshold=0d 3h 0m 0s).  I'm forcing an immediate check of the service.
[1402547551] SERVICE ALERT: TMB61;heartbeatNotify;CRITICAL;HARD;1;CRITICAL: No Traps Detected
[1402558411] Warning: The results of service 'heartbeatNotify' on host 'TMB61' are stale by 0d 0h 1m 0s (threshold=0d 3h 0m 0s).  I'm forcing an immediate check of the service.
[1402558411] SERVICE ALERT: TMB61;heartbeatNotify;CRITICAL;HARD;1;CRITICAL: No Traps Detected
[1402560554] SERVICE ALERT: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/12/2014 12:05:15 PM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():ON
[1402600087] Warning: The results of service 'heartbeatNotify' on host 'TMB61' are stale by 0d 0h 0m 22s (threshold=0d 3h 0m 0s).  I'm forcing an immediate check of the service.
[1402600088] SERVICE ALERT: TMB61;heartbeatNotify;CRITICAL;HARD;1;CRITICAL: No Traps Detected
[1402603777] SERVICE ALERT: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/13/2014 12:05:15 AM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():ON
[1402621491] Warning: The results of service 'heartbeatNotify' on host 'TMB61' are stale by 0d 0h 0m 5s (threshold=0d 3h 0m 0s).  I'm forcing an immediate check of the service.
[1402621492] SERVICE ALERT: TMB61;heartbeatNotify;CRITICAL;HARD;1;CRITICAL: No Traps Detected
[1402625308] SERVICE ALERT: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status OFF / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/13/2014 6:05:15 AM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():OFF
[1402632000] CURRENT SERVICE STATE: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status OFF / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/13/2014 6:05:15 AM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():OFF
[1402636132] Warning: The results of service 'heartbeatNotify' on host 'TMB61' are stale by 0d 0h 0m 24s (threshold=0d 3h 0m 0s).  I'm forcing an immediate check of the service.
[1402636133] SERVICE ALERT: TMB61;heartbeatNotify;CRITICAL;HARD;1;CRITICAL: No Traps Detected
[1402639180] SERVICE ALERT: TMB61;heartbeatNotify;OK;HARD;1;TMB61 System status ON / enterprises.x.100.2.1.0 ():TMB61 enterprises.x.100.2.2.0 (): enterprises.x.100.2.3.0 (): enterprises.x.100.2.4.0 ():TMB61 -  enterprises.x.100.2.5.0 ():6/13/2014 9:57:02 AM GMT +04:00 enterprises.x.100.2.6.0 (): enterprises.x.100.2.7.0 ():System status enterprises.x.100.2.8.0 ():ON
And thanks again Spenser for all your help.