NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Wed Feb 05, 2014 2:22 pm
by wattwood
Fresh install on ESXi 5.5. I set up the network with a static IP address of 10.1.10.43 and a netmask of 255.255.252.0. When setting up a source, the following appears in /var/log/messages:
Feb 5 19:19:07 localhost kernel: sfcapd[2236] general protection ip:7ff18ddb27fe sp:a8cef9c7beea89f2 error:0 in libc-2.12.so[7ff18dd80000+18b000]

Source IP: 10.1.10.180

10.1.10.180 is a Dell PowerConnect 6224 with sFlow enabled. I can see traffic coming in using tcpdump.

* NetFlow should be Network Analyzer.

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Wed Feb 05, 2014 2:54 pm
by wattwood
When attempting to run the command manually, I get a printout of options:


/usr/local/bin/sfcapd -l /usr/local/nagiosna/var/SWAN/flows -p 6038 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/SWAN/6038.pid -D -e -w -z

Add extension: 2 byte input/output interface index
Add extension: 4 byte input/output interface index
Add extension: 2 byte src/dst AS number
Add extension: 4 byte src/dst AS number

usage /usr/local/bin/sfcapd [options]
-h this text you see right here
-u userid Change user to userid
-g groupid Change group to groupid
-w Sync file rotation with next 5min (default) interval
-t interval set the interval to rotate sfcapd files
-b host bind socket to host/IP addr
-j mcastgroup Join multicast group <mcastgroup>
-p portnum listen on port portnum
-l logdir set the output directory. (no default)
-S subdir Sub directory format. see nfcapd(1) for format
-I Ident set the ident string for stat file. (default 'none')
-H Add port histogram data to flow file.(default 'no')
-n Ident,IP,logdir Add this flow source - multiple streams
-P pidfile set the PID file
-R IP[/port] Repeat incoming packets to IP address/port
-x process launch process after a new file becomes available
-z Compress flows in output file.
-B bufflen Set socket buffer to bufflen bytes
-e Expire data at each cycle.
-D Fork to background
-E Print extended format of sflow data. for debugging purpose only.
-4 Listen on IPv4 (default).
-6 Listen on IPv6.
-V Print version and exit.

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Wed Feb 05, 2014 3:20 pm
by wattwood
The odd thing now is that I can start it on a different port with gdb running, and it will load with minimal options. If I close the program and re-run it on the same port, I get the crash.

gdb output with minimal options:
Starting program: /usr/local/bin/sfcapd -l /usr/local/nagiosna/var/SWAN/flows -p 6038
Add extension: 2 byte input/output interface index
Add extension: 4 byte input/output interface index
Add extension: 2 byte src/dst AS number
Add extension: 4 byte src/dst AS number


Program received signal SIGSEGV, Segmentation fault.
0x00007ffff78617fe in __longjmp () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.x86_64
(gdb) backtrace
#0 0x00007ffff78617fe in __longjmp () from /lib64/libc.so.6
#1 0x126174aa55c83ce4 in ?? ()
Cannot access memory at address 0x126174aa55c83ce4
(gdb)

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 11:37 am
by sreinhardt
I think you are having two issues. First, in your second post, the %d %f %i placeholders are likely being expanded by Python prior to executing, to provide the flags passed to reap_files.py. Second, it seems sfcapd might have trouble when the port it is attempting to bind to is already in use, although it was not clear whether you were running both commands at the same time or letting the first die before starting the second.
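Another possible reading of the usage printout, offered only as a sketch: when the command is typed into a shell without quotes, -x receives just the script path, and %d, %f and %i arrive as three separate arguments. Whether sfcapd rejects those extra arguments and prints its usage text is an assumption here. The helper option_value is hypothetical, and shlex.split only approximates how a POSIX shell would tokenize the line.

```python
import shlex

def option_value(argv, flag):
    """Return the single token that follows `flag`, or None if it is absent."""
    if flag in argv:
        i = argv.index(flag)
        if i + 1 < len(argv):
            return argv[i + 1]
    return None

BASE = ("/usr/local/bin/sfcapd -l /usr/local/nagiosna/var/SWAN/flows -p 6038 "
        "-x {launch} -P /usr/local/nagiosna/var/SWAN/6038.pid -D -e -w -z")

# As typed in the post: the launch command is not quoted.
unquoted = BASE.format(launch="/usr/local/nagiosna/bin/reap_files.py %d %f %i")
# Quoted so that the whole launch command is one argument to -x.
quoted = BASE.format(launch="'/usr/local/nagiosna/bin/reap_files.py %d %f %i'")

print(option_value(shlex.split(unquoted), "-x"))
print(option_value(shlex.split(quoted), "-x"))
```

If this is what is happening, quoting the launch command when running sfcapd by hand would be worth trying before drawing conclusions from the usage output.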

32-bit vSphere - no rrdtool, segfault

Posted: Thu Feb 06, 2014 11:46 am
by wattwood
After issues with the 64-bit vSphere image, I attempted to use the 32-bit image. It also does not work.

The 32-bit vSphere image does not have rrdtool or python-rrdtool installed:

Traceback (most recent call last):
File "/usr/local/nagiosna/bin/initialize_source.py", line 16, in <module>
import rrdtool
ImportError: librrd.so.4: cannot open shared object file: No such file or directory
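
A quick way to confirm whether the binding and its shared library load, independent of initialize_source.py, is to attempt the import directly. This is a minimal sketch written for Python 3; try_import is a hypothetical helper, and the module name rrdtool is taken from the traceback above.

```python
def try_import(name):
    """Return (True, "") if the module imports, else (False, error text)."""
    try:
        __import__(name)
        return True, ""
    except ImportError as exc:
        # A missing librrd.so.4 surfaces here as an ImportError as well.
        return False, str(exc)

ok, err = try_import("rrdtool")
print("rrdtool: %s" % ("OK" if ok else "FAILED: " + err))
```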


After installing rrdtool and python-rrdtool, the rrd file is still not created, and no error is reported.

With those packages installed, the daemon is running but is not listening on the UDP port, and the PID in the pid file does not match the running process:
[root@localhost WAN]# cat 6038.pid
11022
[root@localhost WAN]# ps -e | grep 1102
11023 ? 00:00:00 sfcapd

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nna 11023 0.0 0.0 2332 440 ? S 16:41 0:00 /usr/local/bin/sfcapd -I 5 -l /usr/local/nagiosna/var/WAN/flows -p 6038 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/WAN/6038.pid -D -e -w -z
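
Judging by the /var/log/messages excerpt in this post, PID 11022 is the collector that segfaulted and 11023 is the launcher that was left behind, so the pid file may be correct but stale. A small sketch for checking whether the PID recorded in a pid file is still alive; pid_alive and check_pidfile are hypothetical helpers, not part of NNA.

```python
import errno
import os

def pid_alive(pid):
    """Return True if a process with this PID exists (signal 0 delivers nothing)."""
    try:
        os.kill(pid, 0)
    except OSError as exc:
        # EPERM means the process exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True

def check_pidfile(path):
    """Read a pid file and return (pid, whether that process is running)."""
    with open(path) as handle:
        pid = int(handle.read().strip())
    return pid, pid_alive(pid)
```

For example, check_pidfile("/usr/local/nagiosna/var/WAN/6038.pid") would be expected to report the recorded PID as not running if the collector has crashed.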

/var/log/httpd/error.log:
Add extension: 2 byte input/output interface index
Add extension: 4 byte input/output interface index
Add extension: 2 byte src/dst AS number
Add extension: 4 byte src/dst AS number
ERROR: opening '/usr/local/nagiosna/var/WAN/bandwidth.rrd': No such file or directory

/var/log/messages:
Feb 6 16:41:10 localhost sfcapd[11022]: Startup.
Feb 6 16:41:10 localhost sfcapd[11023]: Launcher: Startup. auto-expire enabled
Feb 6 16:41:10 localhost sfcapd[11022]: SFLOW: New exporter
Feb 6 16:41:10 localhost sfcapd[11022]: SFLOW: New exporter: SysID: 1, agentSubId: 0, MeanSkipCount: 1024, IP: 10.1.10.180
Feb 6 16:41:10 localhost sfcapd[11022]: SFLOW: setup extension map: 0
Feb 6 16:41:10 localhost sfcapd[11022]: SFLOW: setup extension map 0
Feb 6 16:41:10 localhost sfcapd[11022]: Extension size: 16
Feb 6 16:41:10 localhost sfcapd[11022]: Extension map size: 16
Feb 6 16:41:10 localhost sfcapd[11022]: New extension map id: 0
Feb 6 16:41:10 localhost sfcapd[11022]: SFLOW: setup extension map: 0 done
Feb 6 16:41:10 localhost kernel: sfcapd[11022]: segfault at 30ba8866 ip 30ba8866 sp 30ba8866 error 14


It appears both of your vSphere images are faulty.
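
The kernel segfault line above can be decoded a little further. The sketch below assumes the error field is hexadecimal and follows the x86 page-fault error-code bit layout; decode_segfault is a hypothetical helper. Under those assumptions, error 14 with the faulting address equal to ip reads as a user-mode instruction fetch from an unmapped page, meaning execution jumped to an invalid address.

```python
import re

# (mask, meaning if set, meaning if clear) for x86 page-fault error codes.
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "not a write"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

SEGFAULT_RE = re.compile(
    r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) error ([0-9a-f]+)")

def decode_segfault(line):
    """Parse a kernel 'segfault at ...' line; return None if it does not match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    addr, ip, sp, err = (int(group, 16) for group in match.groups())
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        label = if_set if err & mask else if_clear
        if label:
            flags.append(label)
    return {"addr": addr, "ip": ip, "sp": sp, "error": err,
            "flags": flags, "jumped_to_fault_addr": addr == ip}

line = ("Feb 6 16:41:10 localhost kernel: sfcapd[11022]: "
        "segfault at 30ba8866 ip 30ba8866 sp 30ba8866 error 14")
print(decode_segfault(line))
```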

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 11:50 am
by wattwood
I was not trying to run both at the same time. The port was not in use. If I did attempt to run both at the same time, I would get an error on binding to the port, if the first attempt was able to bind and didn't throw a segfault from sfcapd.

netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 1154/mysqld
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2540/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1186/sendmail
tcp 0 0 10.1.10.43:22 10.1.10.40:49127 ESTABLISHED 1797/sshd
tcp 0 0 :::80 :::* LISTEN 1203/httpd
tcp 0 0 :::22 :::* LISTEN 2540/sshd
tcp 0 0 ::ffff:10.1.10.43:80 ::ffff:10.1.10.10:8064 TIME_WAIT -
tcp 0 0 ::ffff:10.1.10.43:80 ::ffff:10.1.10.10:8136 TIME_WAIT -
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 6620 1/init @/com/ubuntu/upstart
unix 2 [ ACC ] STREAM LISTENING 8047 1154/mysqld /var/lib/mysql/mysql.sock
unix 3 [ ] DGRAM 23442 2585/rsyslogd /dev/log
unix 2 [ ] DGRAM 29643 11023/sfcapd
unix 2 [ ] DGRAM 23424 2568/crond
unix 2 [ ] DGRAM 9250 1816/su
unix 2 [ ] DGRAM 9231 1815/sudo
unix 3 [ ] STREAM CONNECTED 9180 1797/sshd
unix 3 [ ] STREAM CONNECTED 9179 1799/sshd
unix 2 [ ] DGRAM 9176 1797/sshd
unix 2 [ ] DGRAM 8118 1194/sendmail: Queu
unix 2 [ ] DGRAM 8099 1186/sendmail
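
The netstat excerpt above lists no UDP sockets at all. As a cross-check that does not depend on reading netstat output, a test bind can show whether a UDP port is free. This is a minimal sketch; udp_port_free is a hypothetical helper and port 6038 is the port used in this thread.

```python
import socket

def udp_port_free(port, host="0.0.0.0"):
    """Return True if a UDP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        return False
    finally:
        sock.close()
    return True

print("UDP 6038 free:", udp_port_free(6038))
```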


The 32-bit vSphere image is having similar issues:
gdb /usr/local/bin/sfcapd
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/bin/sfcapd...done.
(gdb) set args -I 5 -l /usr/local/nagiosna/var/WAN/flows -p 6038
(gdb) run
Starting program: /usr/local/bin/sfcapd -I 5 -l /usr/local/nagiosna/var/WAN/flows -p 6038
Add extension: 2 byte input/output interface index
Add extension: 4 byte input/output interface index
Add extension: 2 byte src/dst AS number
Add extension: 4 byte src/dst AS number

Program received signal SIGSEGV, Segmentation fault.
0x955d0d63 in ?? ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.i686
(gdb) backtrace
#0 0x955d0d63 in ?? ()

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 12:56 pm
by sreinhardt
I just tested both the 32-bit and 64-bit 2014r1.3 OVA downloads of NNA, using both NetFlow and sFlow sources, and had no issues at all. If I had to guess, the download or the import into VMware was corrupted. I can provide a SHA-1 or MD5 sum for both of those if you would like.

Edit: I also merged your posts, since they are almost definitely linked.

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 1:20 pm
by wattwood
I'm running VMware ESXi 5.5 and vCenter 5.5 Standard w/Operations Manager. No vApp functionality, manual network configuration. Toss the md5sum this way and I'll compare.

/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
ONBOOT=on
IPADDR=10.1.10.43
NETMASK=255.255.252.0
GATEWAY=10.1.10.1
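
As a sanity check on the addressing above, the netmask can be expanded to see which hosts fall inside the VM's subnet. This sketch uses Python 3's ipaddress module; same_subnet is a hypothetical helper, and the addresses are the ones mentioned in this thread.

```python
import ipaddress

def same_subnet(ip, netmask, *others):
    """Return the network for ip/netmask and whether each other address is in it."""
    network = ipaddress.ip_network("%s/%s" % (ip, netmask), strict=False)
    return network, {o: ipaddress.ip_address(o) in network for o in others}

network, members = same_subnet(
    "10.1.10.43", "255.255.252.0",
    "10.1.10.1",    # gateway
    "10.1.10.180",  # sFlow source (PowerConnect 6224)
)
print(network)   # 10.1.8.0/22
print(members)
```

Both the gateway and the sFlow source sit inside 10.1.8.0/22, so the static configuration itself looks consistent.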

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 2:20 pm
by wattwood
I installed vanilla CentOS 6, 64-bit, and installed NNA from source. The issue persists there, as well.

I changed the network adapter from VMXNET3 to E1000. No difference. Is there a specific version of glibc I should be using?

Re: NetFlow VM - ESXi 5.5 - Kernel crash

Posted: Thu Feb 06, 2014 5:41 pm
by tmcdonald

  File: nagiosna-2014r1.3.ova
CRC-32: 025485d8
   MD4: 3623fe497d9775c1ae647c5d62f77841
   MD5: b0d328a85890c8b2e2101f4121e27524
 SHA-1:  084f044bb205f652d6a7ca067524a5cda9e706f7

  File: nagiosna-2014r1.3-64.ova
CRC-32: ff672234
   MD4: 26d597da0296400c6158577e83ab2f39
   MD5: 7e8e17aab27e87aa9492a9f883bafdaf
 SHA-1:  3fc1931ff87e6ef2f25e12cb152ea762199f649a
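
To compare a downloaded OVA against the sums above, something along these lines should work. It is a minimal sketch: file_digests and verify are hypothetical helpers, the expected values are copied from this post, and the path passed in is wherever the download was saved.

```python
import hashlib

# (MD5, SHA-1) as posted above, keyed by file name.
EXPECTED = {
    "nagiosna-2014r1.3.ova": (
        "b0d328a85890c8b2e2101f4121e27524",
        "084f044bb205f652d6a7ca067524a5cda9e706f7",
    ),
    "nagiosna-2014r1.3-64.ova": (
        "7e8e17aab27e87aa9492a9f883bafdaf",
        "3fc1931ff87e6ef2f25e12cb152ea762199f649a",
    ),
}

def file_digests(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha1_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def verify(path, name):
    """Return True if the file at `path` matches the posted sums for `name`."""
    return file_digests(path) == EXPECTED[name]
```

For example, verify("nagiosna-2014r1.3-64.ova", "nagiosna-2014r1.3-64.ova") returning False would point to a corrupted download.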