sflow Source stops working

This support forum board is for support questions relating to Nagios Network Analyzer, our network traffic and bandwidth analysis solution.
nagios-retail
Posts: 36
Joined: Mon Feb 09, 2015 3:32 am

Re: sflow Source stops working

Post by nagios-retail »

Hello,

Yes, tcpdump is receiving traffic on port 9917.
Because the sfcapd processes keep crashing, nothing is listening on the port anymore, so the source can't collect any traffic.
Our server address ap-dco101-ias.bel.centric.lan is in fact a duplicate name in our DNS records. We need to correct this.
It is still not clear why the process stops so frequently.

Code: Select all

[root@AP-DCO163-NA ~]# dmesg |grep sfcapd
sfcapd[1338] general protection ip:7f0868c5f7fe sp:8dea6ab57ae68ca8 error:0 in libc-2.12.so[7f0868c2d000+18b000]
sfcapd[3254] general protection ip:7feaf1bdd7fe sp:754853c4b77e3348 error:0 in libc-2.12.so[7feaf1bab000+18b000]
sfcapd[10653] general protection ip:7fcde71a47fe sp:4f36306381642cab error:0 in libc-2.12.so[7fcde7172000+18b000]
sfcapd[13743] general protection ip:7fc01de547fe sp:839d0e0d422a960a error:0 in libc-2.12.so[7fc01de22000+18b000]
sfcapd[28255] general protection ip:7f35853774fe sp:91beaa1efe820e87 error:0 in libc-2.12.so[7f3585345000+18a000]
sfcapd[19284] general protection ip:7f658f5584fe sp:d5d388e61a4cf691 error:0 in libc-2.12.so[7f658f526000+18a000]
[root@AP-DCO163-NA ~]# netstat -an |grep 99
udp        0      0 0.0.0.0:9915                0.0.0.0:*
udp        0      0 0.0.0.0:9916                0.0.0.0:*
[root@AP-DCO163-NA ~]# tcpdump port 9917
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:04:59.681085 IP 10.77.255.83.39349 > ap-dco101-ias.bel.centric.lan.9917: UDP, length 1396
16:04:59.835817 IP 10.77.255.84.33552 > ap-dco101-ias.bel.centric.lan.9917: UDP, length 1316
16:04:59.860930 IP 10.77.255.82.55704 > ap-dco101-ias.bel.centric.lan.9917: UDP, length 1308
16:04:59.893236 IP 10.77.255.81.52076 > ap-dco101-ias.bel.centric.lan.9917: UDP, length 404
^C
4 packets captured
9 packets received by filter
0 packets dropped by kernel
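
A small sketch for reading the dmesg lines above: subtracting the mapping base (the first number in the square brackets after libc-2.12.so) from the ip value gives the offset of the faulting instruction inside libc. The function names are made up for illustration, and the script assumes a shell that accepts hex constants in arithmetic expansion (bash does).

```shell
# Offset of the faulting instruction inside the mapped library.
# Arguments: instruction pointer and mapping base, both hex without "0x".
fault_offset() {
    printf '%x\n' $(( 0x$1 - 0x$2 ))
}

# Read "general protection" lines (as printed by dmesg) on stdin and
# print "<pid> <offset>" for each of them. Hypothetical helper name.
fault_offsets_from_dmesg() {
    sed -n 's/.*\[\([0-9]*\)\] general protection ip:\([0-9a-f]*\) .* in [^[]*\[\([0-9a-f]*\)[+][0-9a-f]*\].*/\1 \2 \3/p' |
    while read -r pid ip base; do
        printf '%s %s\n' "$pid" "$(fault_offset "$ip" "$base")"
    done
}

# Usage: dmesg | fault_offsets_from_dmesg
```

For the six lines above this gives 327fe for the first four crashes and 324fe for the last two (where the mapping size also changes from 18b000 to 18a000). Within each group the fault is at the same place in libc, which may point more toward a repeatable code path than toward random corruption, but it does not prove either.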
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: sflow Source stops working

Post by tgriep »

Delete the source in NA.

Log in to the NA server in a shell and change to this folder:

Code: Select all

cd /usr/local/nagiosna/var/
Delete the folder with the old name of the source, recreate the source in NA and see if that works for you.
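
The folder removal described above could be sketched as follows. This is only an illustration: the function name and the source name "oldsource" in the usage line are hypothetical, and the guard clauses are an added precaution, not part of the original advice.

```shell
# Remove the data directory of a deleted source.
# $1 = base directory (e.g. /usr/local/nagiosna/var), $2 = source name.
remove_source_dir() {
    base=$1
    name=$2
    # Refuse empty names and anything that could escape the base directory.
    case $name in
        ''|*/*|.|..) echo "refusing suspicious source name: '$name'" >&2; return 1 ;;
    esac
    if [ ! -d "$base/$name" ]; then
        echo "no such directory: $base/$name" >&2
        return 1
    fi
    rm -rf -- "$base/$name"
}

# Usage (source name is an assumption):
#   remove_source_dir /usr/local/nagiosna/var oldsource
```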
Be sure to check out our Knowledgebase for helpful articles and solutions!
nagios-retail
Posts: 36
Joined: Mon Feb 09, 2015 3:32 am

Re: sflow Source stops working

Post by nagios-retail »

Hello,

This was already suggested earlier. I have done it once again, but the source still stops after 1 hour and 15 minutes.

Code: Select all

[root@AP-DCO163-NA flows]# ll
total 5268
-rw-r--r--+ 1 nna nnacmd 286459 Feb 16 09:50 nfcapd.201502160945
-rw-r--r--+ 1 nna nnacmd 292273 Feb 16 09:55 nfcapd.201502160950
-rw-r--r--+ 1 nna nnacmd 300371 Feb 16 10:00 nfcapd.201502160955
-rw-r--r--+ 1 nna nnacmd 366195 Feb 16 10:05 nfcapd.201502161000
-rw-r--r--+ 1 nna nnacmd 387092 Feb 16 10:10 nfcapd.201502161005
-rw-r--r--+ 1 nna nnacmd 289785 Feb 16 10:15 nfcapd.201502161010
-rw-r--r--+ 1 nna nnacmd 418950 Feb 16 10:20 nfcapd.201502161015
-rw-r--r--+ 1 nna nnacmd 394514 Feb 16 10:25 nfcapd.201502161020
-rw-r--r--+ 1 nna nnacmd 361279 Feb 16 10:30 nfcapd.201502161025
-rw-r--r--+ 1 nna nnacmd 349665 Feb 16 10:35 nfcapd.201502161030
-rw-r--r--+ 1 nna nnacmd 330966 Feb 16 10:40 nfcapd.201502161035
-rw-r--r--+ 1 nna nnacmd 302311 Feb 16 10:45 nfcapd.201502161040
-rw-r--r--+ 1 nna nnacmd 339341 Feb 16 10:50 nfcapd.201502161045
-rw-r--r--+ 1 nna nnacmd 310655 Feb 16 10:55 nfcapd.201502161050
-rw-r--r--+ 1 nna nnacmd 298446 Feb 16 11:00 nfcapd.201502161055
-rw-r--r--+ 1 nna nnacmd 332318 Feb 16 11:05 nfcapd.201502161100
-rw-r--r--+ 1 nna nnacmd    276 Feb 16 11:05 nfcapd.current.17746
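
As a rough cross-check of the listing above, the timestamps in the rotated file names can be turned into an elapsed time. This is a sketch that assumes GNU date (for the -d option); the function names are made up for illustration.

```shell
# Convert an nfcapd rotation suffix (YYYYMMDDhhmm) to epoch seconds.
stamp_to_epoch() {
    formatted=$(printf '%s\n' "$1" |
        sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5/')
    date -u -d "$formatted" +%s
}

# Minutes between two rotation suffixes (first argument is the earlier one).
minutes_between() {
    echo $(( ( $(stamp_to_epoch "$2") - $(stamp_to_epoch "$1") ) / 60 ))
}

# Usage: minutes_between 201502160945 201502161100
```

For the first and last rotated files in the listing (201502160945 and 201502161100) this prints 75, which is consistent with the 1 hour and 15 minutes reported.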
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: sflow Source stops working

Post by scottwilkerson »

How much memory does this machine have?
Former Nagios employee
nagios-retail
Posts: 36
Joined: Mon Feb 09, 2015 3:32 am

Re: sflow Source stops working

Post by nagios-retail »

Code: Select all

[root@AP-DCO163-NA ~]# free
             total       used       free     shared    buffers     cached
Mem:       1019672     610432     409240        148      78196     270956
-/+ buffers/cache:     261280     758392
Swap:       262136      13760     248376
best regards

Koen Beuselinck
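
To put the free output above into more readable numbers, here is a small sketch that reads the older procps layout (the one with a "-/+ buffers/cache" line, values in KiB) and prints total and effectively available memory in MiB. The function name is made up for illustration.

```shell
# Summarize old-style "free" output read from stdin.
summarize_free() {
    awk '
        $1 == "Mem:"                          { total = $2 }  # total RAM in KiB
        $1 == "-/+" && $2 == "buffers/cache:" { avail = $4 }  # free + buffers + cached
        END { printf "total=%dMiB available=%dMiB\n", total / 1024, avail / 1024 }
    '
}

# Usage: free | summarize_free
```

With the values posted above this prints total=995MiB available=740MiB, so the VM has about 1 GB of RAM, of which roughly 740 MiB was available at the time the command was run.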
cmerchant
Posts: 546
Joined: Wed Sep 24, 2014 11:19 am

Re: sflow Source stops working

Post by cmerchant »

I checked, and there is another thread that describes a similar general protection fault in sfcapd. It mentions sfcapd failing when it attempts to reuse the same port.
From that post:
"The odd thing now is that I can start on a different port with gdb running, and it will load with minimal options. If I close the program, and re-run it on the same port, I get the crash."
Here is the post: http://support.nagios.com/forum/viewtop ... 29&t=25444

I need to brush up on the Linux debugger (gdb) to replicate this.
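
Since the other thread ties the crash to reusing a port, it can help to confirm which UDP ports are actually bound before restarting a source. The sketch below parses `netstat -an` output like the one posted earlier in this thread; the function name is made up for illustration.

```shell
# Read `netstat -an` output on stdin; succeed (exit 0) if a UDP socket
# is bound to the given local port, fail (exit 1) otherwise.
udp_port_bound() {
    awk -v port="$1" '
        $1 ~ /^udp/ {
            n = split($4, parts, ":")      # local address is the 4th column
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   netstat -an | udp_port_bound 9917 && echo "9917 is bound" || echo "9917 is not bound"
```

Applied to the netstat output posted earlier, ports 9915 and 9916 are reported as bound and 9917 is not.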
DennisPR
Posts: 149
Joined: Mon May 07, 2012 10:34 am

Re: sflow Source stops working

Post by DennisPR »

Any update on this, please? Or do we need to open a support ticket?
cmerchant
Posts: 546
Joined: Wed Sep 24, 2014 11:19 am

Re: sflow Source stops working

Post by cmerchant »

The cause of a general protection fault is difficult to determine, and such faults are sometimes hardware-related, especially if they are intermittent. Does your NNA server run on a dedicated box, or on a shared VM host? Can you run a physical memory check against the server (memtest86+)?
nagios-retail
Posts: 36
Joined: Mon Feb 09, 2015 3:32 am

Re: sflow Source stops working

Post by nagios-retail »

Hello,

The NNA server is a VM running in a vSphere 5 environment.

best regards,
Koen Beuselinck
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: sflow Source stops working

Post by tgriep »

Another possible cause I found is the server running out of memory. Could you add more memory to the server, set up the source again, and see if it works for you?
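
One way to look for evidence of memory exhaustion is to search the kernel log for OOM killer messages. This is a minimal sketch; the function name is made up, and the two patterns are the phrases the Linux kernel commonly logs when the OOM killer runs.

```shell
# Filter kernel log lines on stdin down to OOM killer events.
oom_events() {
    grep -i -E 'out of memory|invoked oom-killer'
}

# Usage: dmesg | oom_events
```

If this prints nothing, the kernel log holds no record of an OOM kill, although older messages may already have rotated out of the dmesg buffer.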