SNMP Traps stop delivering on 2014R1.1

aap
Posts: 180
Joined: Wed Oct 12, 2011 4:01 am

SNMP Traps stop delivering on 2014R1.1

Post by aap »

Hi,

After upgrading to XI2014R1.1, SNMP traps have stopped being received.

The /var/log/snmptt/snmpttsystem.log file has the following entries:

Sat Jun 21 17:06:21 2014 Loading /etc/snmp/snmptt.conf
Sat Jun 21 17:06:21 2014 Finished loading 8941 lines from /etc/snmp/snmptt.conf
Sat Jun 21 17:06:21 2014 Changing to UID: snmptt (501)
Sat Jun 21 17:08:02 2014 SNMPTT v1.3 started
Sat Jun 21 17:08:02 2014 There seems to be another SNMPTT process (pid 2162) running.
Sat Jun 21 17:08:02 2014 You may want to kill it and delete the .pid file (/var/run/snmptt.pid). Aborting...

I have killed that pid and deleted the .pid file, but the message reappears afterwards, in an endless loop.

However, running top -u snmptt has the following output:

top - 09:43:05 up 1 day, 17:01, 1 user, load average: 1.03, 0.66, 0.55
Tasks: 191 total, 7 running, 184 sleeping, 0 stopped, 0 zombie
Cpu(s): 75.8%us, 12.8%sy, 0.0%ni, 8.0%id, 0.0%wa, 0.1%hi, 3.2%si, 0.0%st
Mem: 15596492k total, 3272980k used, 12323512k free, 196128k buffers
Swap: 262136k total, 0k used, 262136k free, 2158012k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24043 snmptt 20 0 16880 8804 1224 S 0.0 0.1 0:00.00 snmptt

I am not sure what the other snmptt process is.

Your assistance is appreciated.
Last edited by aap on Tue Jun 24, 2014 5:46 am, edited 2 times in total.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: SNMP Traps stop delivering after upgrade to 2014R1.1

Post by slansing »

Let's gather a bit of data. What is the output of:

Code: Select all

rpm -qa | grep snmp

ls -lva /usr/local/bin | grep -i 'snmp\|addmib'
ls -lva /usr/local/sbin | grep -i 'snmp\|addmib'
ls -lva /usr/sbin | grep -i 'snmp\|addmib'
cat /etc/snmp/snmptrapd.conf
grep -i 'daemon_uid\|mode =' /etc/snmp/snmptt.ini

grep -i 'exec' /etc/snmp/snmptt.conf | tail -n 10
grep -i 'nag' /etc/group
grep -i 'snmp' /etc/group
cat /etc/snmp/snmptrapd.conf


ll /var/log/snmptt/
ll -d /var/log/snmptt/
ll /var/spool/snmptt
ll -d /var/spool/snmptt
aap
Posts: 180
Joined: Wed Oct 12, 2011 4:01 am

Re: SNMP Traps stop delivering after upgrade to 2014R1.1

Post by aap »

Thanks. Please see below:


[root@nagprdv ~]# rpm -qa | grep snmp
net-snmp-5.5-49.el6_5.1.i686
net-snmp-perl-5.5-49.el6_5.1.i686
snmptt-1.4-0.9.beta2.el6.noarch
php-snmp-5.3.3-27.el6_5.i686
net-snmp-utils-5.5-49.el6_5.1.i686
net-snmp-devel-5.5-49.el6_5.1.i686
net-snmp-libs-5.5-49.el6_5.1.i686

[root@nagprdv ~]# ls -lva /usr/local/bin | grep -i 'snmp\|addmib'
-rwxrw-rw- 1 root root 869 Mar 8 2012 addmib
-r-xr-xr-x 1 root root 4816 Nov 26 2011 snmpkey
-rwxr-xr-x 1 root root 2393 Feb 28 13:20 snmptraphandling.py
-rw-r--r-- 1 root root 2063 Feb 27 12:50 snmptraphandling_new.py
-rwxr-xr-x 1 root root 30438 Aug 29 2011 snmpttconvertmib


[root@nagprdv ~]# ls -lva /usr/local/sbin | grep -i 'snmp\|addmib'
-rwxr-xr-x 1 root root 174107 Aug 29 2011 snmptt


[root@nagprdv ~]# ls -lva /usr/sbin | grep -i 'snmp\|addmib'
-rwxr-xr-x 1 root root 25972 Mar 24 18:06 snmpd
-rwxr-xr-x 1 root root 25992 Mar 24 18:06 snmptrapd
-rwxr-xr-x 1 root root 177466 Oct 22 2012 snmptt
-rwxr-xr-x 1 root root 6493 Oct 22 2012 snmptthandler


[root@nagprdv ~]# cat /etc/snmp/snmptrapd.conf
disableAuthorization yes
traphandle default /usr/local/sbin/snmptt


[root@nagprdv ~]# grep -i 'daemon_uid\|mode =' /etc/snmp/snmptt.ini
mode = daemon
description_mode = 0
# A second (child) process will be started as the daemon_uid user so
daemon_uid = snmptt

[root@nagprdv ~]# grep -i 'exec' /etc/snmp/snmptt.conf | tail -n 10
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic Power Subsystem EAE Minor trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic Power Subsystem EAE Major trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic Power Subsystem EAE Critical trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Server blade partition changed (22078): Server blade $6 in position $7, in enclosure $5, in rack $3 partition has changed."
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic WSMAN Informational trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic WSMAN Minor trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic WSMAN Major trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Generic WSMAN Critical trap: $7"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "An informational event has occurred. These are events normally generated for informational purposes. $*"
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "An informational event has occurred. These are events normally generated for informational purposes. $*"


[root@nagprdv ~]# grep -i 'nag' /etc/group
nagios:x:500:nagios,apache
nagcmd:x:501:nagios,apache


[root@nagprdv ~]# grep -i 'snmp' /etc/group
snmptt:x:502:


[root@nagprdv ~]# cat /etc/snmp/snmptrapd.conf
disableAuthorization yes
traphandle default /usr/local/sbin/snmptt
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: SNMP Traps stop delivering after upgrade to 2014R1.1

Post by sreinhardt »

Most things look correct there, but it seems your init script is claiming to be snmptt 1.3 and your rpms claim v1.4. Let's do a few things just to tidy up and make sure it is all correct.

Code: Select all

service snmptt stop
service snmptrapd stop
ps -ef | grep snmpt
Ideally the ps command should return only the "grep snmpt" line itself. If you do get something like:

root 29515 1 0 Jun17 ? 00:00:06 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt 29517 29515 0 Jun17 ? 00:00:10 /usr/bin/perl /usr/sbin/snmptt --daemon

You will need to run "kill [pid]" for each of those lines, where the pid is the number in the second column.

Once those are stopped, please run the following:

Code: Select all

usermod -a -G nagcmd snmptt
usermod -a -G nagios snmptt
chown root:nagios /etc/snmp/snmptt.ini /etc/snmp/snmptt.conf /etc/snmp /usr/local/bin/addmib
chmod g+w /etc/snmp/snmptt.ini /etc/snmp
chmod g+x /usr/local/bin/addmib
chown -R snmptt:snmptt /var/spool/snmptt /var/log/snmptt
chmod -R ug+wx /var/spool/snmptt /var/log/snmptt

ls -lart /usr/local/nagios/var/rw/
service snmptt start
service snmptrapd start
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
aap
Posts: 180
Joined: Wed Oct 12, 2011 4:01 am

Re: SNMP Traps stop delivering after upgrade to 2014R1.1

Post by aap »

Ran the commands but the issue persists...

[root@nagprdv ~]# ps -ef | grep snmpt
root 20374 1 0 07:13 ? 00:00:00 /usr/bin/perl /usr/local/sbin/snmptt
snmptt 20375 20374 0 07:13 ? 00:00:00 /usr/bin/perl /usr/local/sbin/snmptt
root 23461 25317 0 07:16 pts/0 00:00:00 grep snmpt
[root@nagprdv ~]# kill 20375
[root@nagprdv ~]# ps -ef | grep snmpt
root 24262 25317 0 07:16 pts/0 00:00:00 grep snmpt
[root@nagprdv ~]# kill 24262
-bash: kill: (24262) - No such process
[root@nagprdv ~]# ps -ef | grep snmpt
root 25795 25317 0 07:18 pts/0 00:00:00 grep snmpt
[root@nagprdv ~]# usermod -a -G nagcmd snmptt
[root@nagprdv ~]# usermod -a -G nagios snmptt
[root@nagprdv ~]# chown root:nagios /etc/snmp/snmptt.ini /etc/snmp/snmptt.conf /etc/snmp /usr/local/bin/addmib
[root@nagprdv ~]# chmod g+w /etc/snmp/snmptt.ini /etc/snmp
[root@nagprdv ~]# chmod g+x /usr/local/bin/addmib
[root@nagprdv ~]# chown -R snmptt:snmptt /var/spool/snmptt /var/log/snmptt
[root@nagprdv ~]# chmod -R ug+wx /var/spool/snmptt /var/log/snmptt
[root@nagprdv ~]# ls -lart /usr/local/nagios/var/rw/
total 20
-rw-rw-r-- 1 nagios nagcmd 9548 Feb 27 2013 nsca.dump
srw-rw---- 1 nagios nagcmd 0 Jun 23 11:10 nagios.qh
prw-rw---- 1 nagios nagcmd 0 Jun 23 11:10 nagios.cmd
drwxrwsr-x. 2 nagios nagcmd 4096 Jun 23 11:10 .
drwxrwxr-x. 6 nagios nagios 4096 Jun 24 07:20 ..
[root@nagprdv ~]# service snmptt start
Starting snmptt: [ OK ]
You have new mail in /var/spool/mail/root
[root@nagprdv ~]# service snmptrapd start
Starting snmptrapd: [ OK ]
[root@nagprdv ~]# ps -ef | grep snmpt
root 28901 1 0 07:20 ? 00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt 28902 28901 0 07:20 ? 00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
root 28974 1 0 07:20 ? 00:00:00 /usr/sbin/snmptrapd -Lsd -p /var/run/snmptrapd.pid
root 30239 25317 0 07:21 pts/0 00:00:00 grep snmpt

The /var/log/snmptt/snmpttsystem.log file still has the following entries

Tue Jun 24 08:42:21 2014 SNMPTT v1.3 started
Tue Jun 24 08:42:21 2014 There seems to be another SNMPTT process (pid 19782) running.
Tue Jun 24 08:42:21 2014 You may want to kill it and delete the .pid file (/var/run/snmptt.pid). Aborting...
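
A quick way to tell whether the pid recorded in /var/run/snmptt.pid belongs to a live process or is just left over is a check along the lines of the sketch below. The function name pid_file_status is made up for illustration, and it assumes you run it as root so that kill -0 is allowed to probe any process.

```shell
#!/bin/sh
# Classify a pid file as "missing", "stale", or "running".
# Assumption: run as root, so that kill -0 may probe any process.
pid_file_status() {
    pid_file="$1"
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    # Read the first line of the pid file (tolerate a missing trailing newline).
    read -r pid < "$pid_file" || true
    case "$pid" in
        # Empty or non-numeric content cannot refer to a live process.
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # kill -0 sends no signal; it only reports whether the pid exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

# Example: pid_file_status /var/run/snmptt.pid
```

If this reports "running", something is still (re)starting snmptt, so deleting the .pid file alone will not make the message go away.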
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: SNMP Traps stop delivering on 2014R1.1

Post by sreinhardt »

You're still getting mismatched versions of snmptt, which is not good. Let's check a few other things then:

Code: Select all

locate snmptt 
If you do not have the locate command, run:

Code: Select all

yum install mlocate -y
updatedb
locate snmptt 
Then, could you also post your /etc/init.d/snmptt file? Did you compile either copy from source, or were both rpm installs?
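
If locate is not an option, a rough alternative is to look for executable copies in the usual directories directly. This is only a sketch: find_copies is a made-up helper name, and the directories in the example are simply the ones that have come up in this thread.

```shell
#!/bin/sh
# Print every executable file called NAME found directly in the given directories.
find_copies() {
    name="$1"
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
}

# Example (paths are the ones that appear in this thread):
#   for f in $(find_copies snmptt /usr/sbin /usr/local/sbin); do
#       rpm -qf "$f"    # names the owning package, or says no package owns it
#   done
```

A copy that no package owns was most likely installed by hand or from source, which would help explain two different versions on the same box.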
aap
Posts: 180
Joined: Wed Oct 12, 2011 4:01 am

Re: SNMP Traps stop delivering on 2014R1.1

Post by aap »

Issue resolved.

I found the issue in the /etc/snmp/snmptt.ini file

# Set to either 'standalone' or 'daemon'
# standalone: snmptt called from snmptrapd.conf
# daemon: snmptrapd.conf calls snmptthandler
# Ignored by Windows. See documentation
mode = standalone

The mode was set to daemon, which meant that another snmptt process was started after starting the snmptt service. I reverted it to standalone and restarted the service. Everything is working fine again now.
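
For anyone who wants to check for the same mismatch, here is a minimal sketch based only on the comments in snmptt.ini quoted above: standalone expects snmptrapd.conf to call snmptt, and daemon expects it to call snmptthandler. The function name check_snmptt_mode is made up, and the parsing assumes the simple one-line "mode = ..." and "traphandle default /path" forms shown in this thread.

```shell
#!/bin/sh
# Check that "mode" in snmptt.ini agrees with the traphandle in snmptrapd.conf.
# Per the snmptt.ini comments: standalone -> snmptrapd calls snmptt,
#                              daemon     -> snmptrapd calls snmptthandler.
check_snmptt_mode() {
    ini="$1"
    trapd_conf="$2"
    # Value of the first uncommented "mode = ..." line (description_mode is skipped).
    mode=$(sed -n 's/^[[:space:]]*mode[[:space:]]*=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$ini" | head -n 1)
    # Basename of the program named on the first uncommented traphandle line.
    handler=$(awk '$1 == "traphandle" { n = split($3, parts, "/"); print parts[n]; exit }' "$trapd_conf")
    case "$mode:$handler" in
        standalone:snmptt|daemon:snmptthandler)
            echo "ok: mode=$mode handler=$handler" ;;
        *)
            echo "mismatch: mode=$mode handler=$handler" ;;
    esac
}

# Example: check_snmptt_mode /etc/snmp/snmptt.ini /etc/snmp/snmptrapd.conf
```

With the files as posted earlier in this thread (mode = daemon, traphandle pointing at /usr/local/sbin/snmptt), this would report a mismatch.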

I think the issue was the result of an incomplete NSTI installation that failed. The installation script modifies this entry.

Thanks again for your help.