Network Analyzer Slow
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Network Analyzer Slow
Network Analyzer seems to be really sluggish. When I click on sources, it takes a very long time before the chart shows up. It then takes even longer for the Top 5 Talkers to show.
We also have NA integrated into XI, and that seems to be sluggish as well. I would click on a host and then click on the Network Traffic Analysis tab. It will stay there spinning.
Here's what the NA data looks like. I was doing some testing with the last two sources, which aren't too big. NA is running version 2R1.0.
Re: Network Analyzer Slow
If you run 'top' while the sluggishness is going on, what process is taking up the most CPU time? I assume it's a bunch of httpd threads? Could you show us the top output during that time?
Does your server have an appropriate amount of resources?
Code: Select all
free -m
df -h
top -b -n 1 | head -n 5
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
Here's what top looks like
Also, when I click on Sources, it sits there, and it looks like if I click on something else to navigate away, it will sit and wait until the first request is done (i.e. I click on a source's summary, then click on Reports, and I have to wait).
Here's me clicking on the Network Traffic tab in XI
The VM currently has 4 vCPUs and 4GB RAM.
Code: Select all
# free -m
             total       used       free     shared    buffers     cached
Mem:          3961       3815        145          0         16       3404
-/+ buffers/cache:        394       3567
Swap:          255          2        253
Code: Select all
# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           50G   27G   22G  56% /
devtmpfs        2.0G  148K  2.0G   1% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
/dev/sda1        50G   27G   22G  56% /
Code: Select all
top - 12:17:41 up 93 days, 23:59, 3 users, load average: 1.00, 1.00, 0.76
Tasks: 156 total, 1 running, 155 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.1%sy, 0.0%ni, 99.5%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4056704k total, 3906484k used, 150220k free, 2692k buffers
Swap: 262136k total, 2452k used, 259684k free, 3520752k cached
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
Here's another example... since I was watching top.
Re: Network Analyzer Slow
Could you show us your apache error logs and access logs as well?
Code: Select all
tail -f /var/log/httpd/error_log
tail -f /var/log/httpd/access_log
It looks like your RAM is being stressed - any chance you could bump it up to 8GB? I want to ensure that the kernel doesn't reap any nfdump processes. You can take a look at your system log to ensure that has not happened:
Code: Select all
cat /var/log/messages
How long did you set your retention window? At a certain point, the amount of data retained will slow down any operation that requires parsing the historical data. You may want to make use of 'Views' to narrow the scope of the information you see.
Retention settings are defined when you create a source.
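Regarding the system log check mentioned above: rather than reading the whole file, the log could be filtered for OOM-killer activity. This is only a minimal sketch. The helper name `check_oom` and the match patterns are assumptions (kernel message wording varies between versions), so treat them as a starting point.

```shell
#!/bin/sh
# check_oom: hypothetical helper, not part of Nagios NA.
# Prints log lines that look like OOM-killer activity, or a short note
# when none are found. The patterns are assumptions based on common
# kernel messages and may need adjusting for your kernel.
check_oom() {
    log_file="${1:-/var/log/messages}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    if grep -iE 'out of memory|oom-killer|killed process' "$log_file"; then
        return 0
    fi
    echo "no OOM-killer entries found in $log_file"
}

# Usage (as root on the NA server):
#   check_oom /var/log/messages
```

If nfdump shows up in any matching line, that would point at memory pressure rather than the web interface.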
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
The source I am clicking on was just added yesterday, so there's less than 24 hours' worth of data. I was hoping to retain a month of data, as sometimes we don't notice an issue until a week or so later, and if the retention is set too low, we won't be able to go back and check historical data.
I can bump the NA to 8GB of RAM to test. We have only 4 sources total collecting data, which doesn't seem like a lot, but it could also be due to the volume, I guess. Even navigating the pages within NA is painfully slow. I do want to mention that this server does not have any access to the internet. I wonder if there is anything that's trying to talk with the outside world? (just a wild stab)
I am monitoring all 3 logs (access, error and messages) and they don't move when I click on the source. While it is loading there isn't any activity.
Here's the messages log; the other logs have no activity.
Clicking in NA into a source (first block), then clicking on Network traffic in XI (second block):
Code: Select all
Apr 30 13:40:00 nagiosna nfcapd[7133]: Ident: '5' Flows: 2105739, Packets: 187169867, Bytes: 103502253460, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7133]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7133]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7165]: Ident: '7' Flows: 7661, Packets: 47834141, Bytes: 54853756737, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7165]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7165]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7166]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-B01/flows'
Apr 30 13:40:00 nagiosna nfcapd[7166]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:00 nagiosna nfcapd[7166]: Current size: 30658560 = 29.2 MB, Current lifetime: 761400 = 1.3 weeks, Number of files: 2539
Apr 30 13:40:00 nagiosna nfcapd[7166]: expire completed - nothing to expire.
Apr 30 13:40:00 nagiosna nfcapd[7166]: laucher child exit 1 childs.
Apr 30 13:40:00 nagiosna nfcapd[7166]: laucher waiting childs done. 0 childs
Apr 30 13:40:00 nagiosna nfcapd[7093]: Ident: '3' Flows: 177580, Packets: 101594312, Bytes: 88765928080, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7093]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7093]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7094]: Run expire on '/usr/local/nagiosna/var/S3FP01N/flows'
Apr 30 13:40:00 nagiosna nfcapd[7094]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:00 nagiosna nfcapd[7094]: Current size: 19890323456 = 18.5 GB, Current lifetime: 2999400 = 5.0 weeks, Number of files: 9999
Apr 30 13:40:00 nagiosna nfcapd[7094]: expire completed - nothing to expire.
Apr 30 13:40:00 nagiosna nfcapd[7094]: laucher child exit 1 childs.
Apr 30 13:40:00 nagiosna nfcapd[7094]: laucher waiting childs done. 0 childs
Apr 30 13:40:01 nagiosna nfcapd[7134]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-A01/flows'
Apr 30 13:40:01 nagiosna nfcapd[7134]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:01 nagiosna nfcapd[7134]: Current size: 5080039424 = 4.7 GB, Current lifetime: 763800 = 1.3 weeks, Number of files: 2547
Apr 30 13:40:01 nagiosna nfcapd[7134]: expire completed - nothing to expire.
Apr 30 13:40:01 nagiosna nfcapd[7101]: Ident: '4' Flows: 27243, Packets: 14281876, Bytes: 19197325545, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:01 nagiosna nfcapd[7101]: Signal launcher
Apr 30 13:40:01 nagiosna nfcapd[7101]: Total ignored packets: 0
Apr 30 13:40:01 nagiosna nfcapd[7134]: laucher child exit 1 childs.
Apr 30 13:40:01 nagiosna nfcapd[7134]: laucher waiting childs done. 0 childs
Apr 30 13:40:01 nagiosna nfcapd[7102]: Run expire on '/usr/local/nagiosna/var/S3FP02N/flows'
Apr 30 13:40:01 nagiosna nfcapd[7102]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:01 nagiosna nfcapd[7102]: Current size: 2302107648 = 2.1 GB, Current lifetime: 2999400 = 5.0 weeks, Number of files: 9999
Apr 30 13:40:01 nagiosna nfcapd[7102]: expire completed - nothing to expire.
Apr 30 13:40:01 nagiosna nfcapd[7102]: laucher child exit 1 childs.
Apr 30 13:40:01 nagiosna nfcapd[7102]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7085]: Ident: '2' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7085]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7085]: Total ignored packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7086]: Run expire on '/usr/local/nagiosna/var/S3FP02/flows'
Apr 30 13:40:10 nagiosna nfcapd[7086]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:10 nagiosna nfcapd[7086]: Current size: 40763392 = 38.9 MB, Current lifetime: 2985300 = 4.9 weeks, Number of files: 9952
Apr 30 13:40:10 nagiosna nfcapd[7086]: expire completed - nothing to expire.
Apr 30 13:40:10 nagiosna nfcapd[7086]: laucher child exit 1 childs.
Apr 30 13:40:10 nagiosna nfcapd[7086]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7197]: Ident: '8' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7197]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7197]: Total ignored packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7198]: Run expire on '/usr/local/nagiosna/var/KIDC-VMware-RTS/flows'
Apr 30 13:40:10 nagiosna nfcapd[7198]: Limits: Filesize <none>, Lifetime 172800 = 2.0 days, Watermark: 95%
Apr 30 13:40:10 nagiosna nfcapd[7198]: Current size: 1286144 = 1.2 MB, Current lifetime: 93900 = 1.1 days, Number of files: 314
Apr 30 13:40:10 nagiosna nfcapd[7198]: expire completed - nothing to expire.
Apr 30 13:40:10 nagiosna nfcapd[7198]: laucher child exit 1 childs.
Apr 30 13:40:10 nagiosna nfcapd[7198]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7053]: Ident: '1' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7053]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7053]: Total ignored packets: 0
Apr 30 13:40:11 nagiosna nfcapd[7054]: Run expire on '/usr/local/nagiosna/var/S3FP01/flows'
Apr 30 13:40:11 nagiosna nfcapd[7054]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:11 nagiosna nfcapd[7054]: Current size: 40763392 = 38.9 MB, Current lifetime: 2985300 = 4.9 weeks, Number of files: 9952
Apr 30 13:40:11 nagiosna nfcapd[7054]: expire completed - nothing to expire.
Apr 30 13:40:11 nagiosna nfcapd[7054]: laucher child exit 1 childs.
Apr 30 13:40:11 nagiosna nfcapd[7054]: laucher waiting childs done. 0 childs
Code: Select all
Apr 30 13:45:00 nagiosna nfcapd[7133]: Ident: '5' Flows: 2098811, Packets: 177006184, Bytes: 106446785704, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7133]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7133]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7101]: Ident: '4' Flows: 28857, Packets: 3726163, Bytes: 3828671956, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7101]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7101]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7102]: Run expire on '/usr/local/nagiosna/var/S3FP02N/flows'
Apr 30 13:45:00 nagiosna nfcapd[7102]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7102]: Current size: 2302558208 = 2.1 GB, Current lifetime: 2999700 = 5.0 weeks, Number of files: 10000
Apr 30 13:45:00 nagiosna nfcapd[7102]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7102]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7102]: laucher waiting childs done. 0 childs
Apr 30 13:45:00 nagiosna nfcapd[7093]: Ident: '3' Flows: 196389, Packets: 86187379, Bytes: 63168076362, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7093]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7093]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7165]: Ident: '7' Flows: 8211, Packets: 27040319, Bytes: 38577369306, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7165]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7165]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7134]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-A01/flows'
Apr 30 13:45:00 nagiosna nfcapd[7134]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7134]: Current size: 5115060224 = 4.8 GB, Current lifetime: 764100 = 1.3 weeks, Number of files: 2548
Apr 30 13:45:00 nagiosna nfcapd[7134]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7134]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7134]: laucher waiting childs done. 0 childs
Apr 30 13:45:00 nagiosna nfcapd[7166]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-B01/flows'
Apr 30 13:45:00 nagiosna nfcapd[7166]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7166]: Current size: 30797824 = 29.4 MB, Current lifetime: 761700 = 1.3 weeks, Number of files: 2540
Apr 30 13:45:00 nagiosna nfcapd[7166]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7166]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7166]: laucher waiting childs done. 0 childs
Apr 30 13:45:01 nagiosna nfcapd[7094]: Run expire on '/usr/local/nagiosna/var/S3FP01N/flows'
Apr 30 13:45:01 nagiosna nfcapd[7094]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:01 nagiosna nfcapd[7094]: Current size: 19893346304 = 18.5 GB, Current lifetime: 2999700 = 5.0 weeks, Number of files: 10000
Apr 30 13:45:01 nagiosna nfcapd[7094]: expire completed - nothing to expire.
Apr 30 13:45:01 nagiosna nfcapd[7094]: laucher child exit 1 childs.
Apr 30 13:45:01 nagiosna nfcapd[7094]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7085]: Ident: '2' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7085]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7085]: Total ignored packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7086]: Run expire on '/usr/local/nagiosna/var/S3FP02/flows'
Apr 30 13:45:10 nagiosna nfcapd[7086]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:10 nagiosna nfcapd[7086]: Current size: 40767488 = 38.9 MB, Current lifetime: 2985600 = 4.9 weeks, Number of files: 9953
Apr 30 13:45:10 nagiosna nfcapd[7086]: expire completed - nothing to expire.
Apr 30 13:45:10 nagiosna nfcapd[7086]: laucher child exit 1 childs.
Apr 30 13:45:10 nagiosna nfcapd[7086]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7197]: Ident: '8' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7197]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7197]: Total ignored packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7198]: Run expire on '/usr/local/nagiosna/var/KIDC-VMware-RTS/flows'
Apr 30 13:45:10 nagiosna nfcapd[7198]: Limits: Filesize <none>, Lifetime 172800 = 2.0 days, Watermark: 95%
Apr 30 13:45:10 nagiosna nfcapd[7198]: Current size: 1290240 = 1.2 MB, Current lifetime: 94200 = 1.1 days, Number of files: 315
Apr 30 13:45:10 nagiosna nfcapd[7198]: expire completed - nothing to expire.
Apr 30 13:45:10 nagiosna nfcapd[7198]: laucher child exit 1 childs.
Apr 30 13:45:10 nagiosna nfcapd[7198]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7053]: Ident: '1' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7053]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7053]: Total ignored packets: 0
Apr 30 13:45:11 nagiosna nfcapd[7054]: Run expire on '/usr/local/nagiosna/var/S3FP01/flows'
Apr 30 13:45:11 nagiosna nfcapd[7054]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:11 nagiosna nfcapd[7054]: Current size: 40767488 = 38.9 MB, Current lifetime: 2985600 = 4.9 weeks, Number of files: 9953
Apr 30 13:45:11 nagiosna nfcapd[7054]: expire completed - nothing to expire.
Apr 30 13:45:11 nagiosna nfcapd[7054]: laucher child exit 1 childs.
Apr 30 13:45:11 nagiosna nfcapd[7054]: laucher waiting childs done. 0 childs
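The expire lines above already report how much raw data each source holds, which makes it easier to see which source is the heavy one. As a rough sketch, they could be condensed into one line per flows directory. The helper name `summarize_expire` is hypothetical, and pairing the lines by the `nfcapd[PID]` tag is an assumption based on how the entries appear in this excerpt.

```shell
#!/bin/sh
# summarize_expire: hypothetical helper, not part of Nagios NA or nfdump.
# Reads syslog lines written by nfcapd's expire run and prints, for each
# flows directory, the most recent size and file count, largest first.
summarize_expire() {
    awk '
        function pid_of(line) {
            # Return the "nfcapd[1234]" tag so lines can be paired per process.
            if (match(line, /nfcapd\[[0-9]+\]/))
                return substr(line, RSTART, RLENGTH)
            return ""
        }
        /Run expire on/ {
            split($0, part, "\047")      # \047 is a single quote
            dir[pid_of($0)] = part[2]
        }
        /Current size: [0-9]+/ {
            d = dir[pid_of($0)]
            if (d == "") next            # no matching "Run expire" line seen
            match($0, /Current size: [0-9]+/)
            bytes[d] = substr($0, RSTART + 14, RLENGTH - 14)
            if (match($0, /Number of files: [0-9]+/))
                files[d] = substr($0, RSTART + 17, RLENGTH - 17)
        }
        END {
            for (d in bytes)
                printf "%10.1f MB  %6d files  %s\n", bytes[d] / 1048576, files[d], d
        }
    ' "$@" | sort -rn
}

# Usage:
#   summarize_expire /var/log/messages
```

Run against the excerpt above, the first line printed should be the S3FP01N flows directory, which nfcapd reports at 18.5 GB.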
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
Added the RAM and will keep an eye on it... so far it is not fully consumed, but the behavior is the same.
Re: Network Analyzer Slow
CFT6Server wrote: I do want to mention that this server does not have any access to the internet. I wonder if there is anything that's trying to talk with the outside world? (just a wild stab)
I think that Nagios NA is set up to resolve hostnames by default. Does the Web GUI get any 'snappier' when you turn DNS resolution off? When you log in, try not clicking the 'Sources' tab. Are the other tabs all quick to respond? If you click Sources, do the other tabs slow down?
My guess at this point is DNS resolution. Let me know if that is the case.
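One way to get a feel for whether lookups stall on a server without internet access is to time a reverse lookup from the shell. This is only a sketch: the helper name `time_reverse_lookup` is made up, and it exercises the system resolver via `getent`, which may or may not be the path NA itself uses.

```shell
#!/bin/sh
# time_reverse_lookup: hypothetical helper, not part of Nagios NA.
# Times a single lookup of an IP address through the system resolver.
# A multi-second result would suggest lookups are waiting on an
# unreachable DNS server.
time_reverse_lookup() {
    ip="$1"
    start=$(date +%s)
    if getent hosts "$ip" >/dev/null 2>&1; then
        result="resolved"
    else
        result="not resolved"
    fi
    end=$(date +%s)
    echo "lookup of $ip took $((end - start))s ($result)"
}

# Usage: pick an address that shows up in your flow data, e.g.
#   time_reverse_lookup 10.0.0.1
```

The address in the usage line is a placeholder, not one taken from this thread.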
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
Looks like ours is not checked. However, I just recently upgraded NA, since we had the old 1.9 OVA deployed. It looks like after the upgrade, the options need to be "reset": the options in global settings were blank. I enabled DNS resolution and then disabled it, then restarted the server.
Second observation: I clicked around the interface, and as long as I don't click onto any sources, the interface is snappy. As soon as I click onto a source, it is pretty slow and I cannot get anywhere until that request is finished. Even trying to open another tab will pause until the source page is fully loaded. (I am seeing that nfdump is running during the wait.)
Lastly, I clicked into a smaller sample (about 33MB) and it is pretty snappy. However, switching over to one of the other sources, which is the core switch for the network, takes a long time to load. Currently this source is sitting at 5.3GB. In your environments, what are your data sample sizes? When do you start to see performance degradation? Perhaps I just need to tune up the resources even more? However, I am not seeing nfdump using more than 1 core.
I tested the response on a 2.2GB data source and it seems to respond OK.
After a couple of clicks and tests, memory usage has consumed most of the 8GB I've given it.
Code: Select all
# free -m
             total       used       free     shared    buffers     cached
Mem:          8001       7079        921          0         25       6627
-/+ buffers/cache:        426       7575
Swap:          255          0        255
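A note on reading the `free -m` output above: in this older layout, the "used" figure on the `Mem:` row includes buffers and page cache, and the `-/+ buffers/cache` row subtracts them back out. The arithmetic can be sketched as follows. The helper name `app_used_mb` is hypothetical, and it assumes the column order shown above (total, used, free, shared, buffers, cached).

```shell
#!/bin/sh
# app_used_mb: hypothetical helper for the older `free -m` layout shown
# above (columns: total used free shared buffers cached).
# Reads that output on stdin and prints used - buffers - cached, i.e. the
# memory held by processes rather than by buffers and page cache.
app_used_mb() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Figures from the post above: 7079 used, 25 buffers, 6627 cached.
printf 'Mem: 8001 7079 921 0 25 6627\n' | app_used_mb   # prints 427

# On a live system with the same free layout:
#   free -m | app_used_mb
```

The result differs by 1 from the 426 that `free` prints because each column is rounded to whole megabytes separately. Either way, most of the 8GB is cache that the kernel can reclaim, not memory pinned by NA.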
-
CFT6Server
- Posts: 506
- Joined: Wed Apr 15, 2015 4:21 pm
Re: Network Analyzer Slow
There's also an interesting message I get...
when looking at "last week"
This report runs on raw data. Your current start time (-1 week) is longer than your raw data lifetime (-5w). You will only be seeing the last 5w of data.
when looking at "last 24hours"
This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-5w).