Network Analyzer Slow

This support forum board is for support questions relating to Nagios Network Analyzer, our network traffic and bandwidth analysis solution.
Locked
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Network Analyzer Slow

Post by CFT6Server »

Network Analyzer seems to be really sluggish. When I click on sources, it takes a very long time before the chart shows up. It then takes even longer for the Top 5 Talkers to show.
We also have NA integrated into XI, and that seems to be sluggish as well. When I click on a host and then on the Network Traffic Analysis tab, it just sits there spinning.

Here's what the NA data looks like.
sources.jpg
I was doing some testing with the last two sources, which aren't too big. NA is running version 2R1.0.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Network Analyzer Slow

Post by jolson »

If you run 'top' while the sluggishness is going on, what process is taking up the most CPU time? I assume it's a bunch of httpd threads? Could you show us the top output during that time?

Does your server have an appropriate amount of resources?

Code: Select all

free -m
df -h
top -b -n 1 | head -n5
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

Here's what top looks like
sources_top.jpg
Also, when I click on Sources, it sits there, and if I click on something else to navigate away, it waits until the first request is done (i.e., I click on Sources to show the summary, then click on Reports, and I have to wait).

Code: Select all

# free -m
             total       used       free     shared    buffers     cached
Mem:          3961       3815        145          0         16       3404
-/+ buffers/cache:        394       3567
Swap:          255          2        253

Code: Select all

# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           50G   27G   22G  56% /
devtmpfs        2.0G  148K  2.0G   1% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
/dev/sda1        50G   27G   22G  56% /

Code: Select all

top - 12:17:41 up 93 days, 23:59,  3 users,  load average: 1.00, 1.00, 0.76
Tasks: 156 total,   1 running, 155 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.1%sy,  0.0%ni, 99.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4056704k total,  3906484k used,   150220k free,     2692k buffers
Swap:   262136k total,     2452k used,   259684k free,  3520752k cached
Here's me clicking on the Network Traffic tab in XI
sources_top2.jpg
VM right now has 4vCPU and 4GB RAM
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

Here's another example... since I was watching top.
top3 high.jpg
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Network Analyzer Slow

Post by jolson »

Could you show us your apache error logs and access logs as well?

Code: Select all

tail -f /var/log/httpd/error_log /var/log/httpd/access_log
It looks like your RAM is being stressed - any chance you could bump it up to 8GB? I want to ensure that the kernel doesn't reap any nfdump processes. You can take a look at your system log to ensure that has not happened:

Code: Select all

cat /var/log/messages
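A minimal sketch for narrowing that log down to out-of-memory kills instead of reading the whole file. The function name `oom_events` is made up for illustration, and the patterns assume the usual kernel wording ("invoked oom-killer", "Out of memory: Kill process ..."); the exact text varies by kernel version, so treat a clean result as a hint rather than proof.

```shell
# Hypothetical helper: print only the lines that look like OOM-killer activity.
# Reads log text on stdin so it can be pointed at any syslog file.
oom_events() {
  # -E: extended regex, -i: ignore case (matches "Kill process" and "Killed process")
  grep -Ei 'invoked oom-killer|out of memory: kill'
}

# Usage (the path is the one mentioned in this thread):
#   oom_events < /var/log/messages
```

If nfdump shows up in the matching lines, the kernel has been reaping it under memory pressure; no output means no matching entries in that file.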
How long did you set your retention window? At a certain point, the amount of data retained will slow down any operation that requires parsing the historical data. You may want to make use of 'Views' to narrow the scope of the information you see.

Retention settings are defined when you create a source:
2015-04-30 14_35_33-Create Source • Nagios Network Analyzer.png
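To get a feel for how the retention window translates into data that has to be parsed, here is a rough sketch that converts a retention period into a capture-file count. The helper name `files_for_retention` is hypothetical, and the 300-second rotation interval is an assumption inferred from the 5-minute cadence of the nfcapd log entries later in this thread, not taken from the NA documentation.

```shell
# Hypothetical helper: how many capture files a retention window implies.
# $1 = retention in seconds
# $2 = rotation interval in seconds (default 300; an assumption, see above)
files_for_retention() {
  echo $(( $1 / ${2:-300} ))
}

files_for_retention 3024000   # 5 weeks of 5-minute files
files_for_retention 172800    # 2 days of 5-minute files
```

Every report that reaches back across the whole window has to open roughly that many files per source, which is one reason a shorter window or a narrower view can help.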
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

The source I am clicking on was just added yesterday, so there's less than 24 hours' worth of data. I was hoping to retain a month of data, as sometimes we won't notice an issue until a week or so later, and if the retention is set too low, we won't be able to go back and check historical data.

I am monitoring all 3 logs (access, error and messages), and none of them move when I click on the source. While it is loading, there isn't any activity.

Here's the messages log; the other logs have no activity.

Clicking into a source in NA

Code: Select all

Apr 30 13:40:00 nagiosna nfcapd[7133]: Ident: '5' Flows: 2105739, Packets: 187169867, Bytes: 103502253460, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7133]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7133]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7165]: Ident: '7' Flows: 7661, Packets: 47834141, Bytes: 54853756737, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7165]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7165]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7166]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-B01/flows'
Apr 30 13:40:00 nagiosna nfcapd[7166]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:00 nagiosna nfcapd[7166]: Current size: 30658560 = 29.2 MB, Current lifetime: 761400 = 1.3 weeks, Number of files: 2539
Apr 30 13:40:00 nagiosna nfcapd[7166]: expire completed - nothing to expire.
Apr 30 13:40:00 nagiosna nfcapd[7166]: laucher child exit 1 childs.
Apr 30 13:40:00 nagiosna nfcapd[7166]: laucher waiting childs done. 0 childs
Apr 30 13:40:00 nagiosna nfcapd[7093]: Ident: '3' Flows: 177580, Packets: 101594312, Bytes: 88765928080, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7093]: Signal launcher
Apr 30 13:40:00 nagiosna nfcapd[7093]: Total ignored packets: 0
Apr 30 13:40:00 nagiosna nfcapd[7094]: Run expire on '/usr/local/nagiosna/var/S3FP01N/flows'
Apr 30 13:40:00 nagiosna nfcapd[7094]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:00 nagiosna nfcapd[7094]: Current size: 19890323456 = 18.5 GB, Current lifetime: 2999400 = 5.0 weeks, Number of files: 9999
Apr 30 13:40:00 nagiosna nfcapd[7094]: expire completed - nothing to expire.
Apr 30 13:40:00 nagiosna nfcapd[7094]: laucher child exit 1 childs.
Apr 30 13:40:00 nagiosna nfcapd[7094]: laucher waiting childs done. 0 childs
Apr 30 13:40:01 nagiosna nfcapd[7134]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-A01/flows'
Apr 30 13:40:01 nagiosna nfcapd[7134]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:01 nagiosna nfcapd[7134]: Current size: 5080039424 = 4.7 GB, Current lifetime: 763800 = 1.3 weeks, Number of files: 2547
Apr 30 13:40:01 nagiosna nfcapd[7134]: expire completed - nothing to expire.
Apr 30 13:40:01 nagiosna nfcapd[7101]: Ident: '4' Flows: 27243, Packets: 14281876, Bytes: 19197325545, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:01 nagiosna nfcapd[7101]: Signal launcher
Apr 30 13:40:01 nagiosna nfcapd[7101]: Total ignored packets: 0
Apr 30 13:40:01 nagiosna nfcapd[7134]: laucher child exit 1 childs.
Apr 30 13:40:01 nagiosna nfcapd[7134]: laucher waiting childs done. 0 childs
Apr 30 13:40:01 nagiosna nfcapd[7102]: Run expire on '/usr/local/nagiosna/var/S3FP02N/flows'
Apr 30 13:40:01 nagiosna nfcapd[7102]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:01 nagiosna nfcapd[7102]: Current size: 2302107648 = 2.1 GB, Current lifetime: 2999400 = 5.0 weeks, Number of files: 9999
Apr 30 13:40:01 nagiosna nfcapd[7102]: expire completed - nothing to expire.
Apr 30 13:40:01 nagiosna nfcapd[7102]: laucher child exit 1 childs.
Apr 30 13:40:01 nagiosna nfcapd[7102]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7085]: Ident: '2' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7085]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7085]: Total ignored packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7086]: Run expire on '/usr/local/nagiosna/var/S3FP02/flows'
Apr 30 13:40:10 nagiosna nfcapd[7086]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:10 nagiosna nfcapd[7086]: Current size: 40763392 = 38.9 MB, Current lifetime: 2985300 = 4.9 weeks, Number of files: 9952
Apr 30 13:40:10 nagiosna nfcapd[7086]: expire completed - nothing to expire.
Apr 30 13:40:10 nagiosna nfcapd[7086]: laucher child exit 1 childs.
Apr 30 13:40:10 nagiosna nfcapd[7086]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7197]: Ident: '8' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7197]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7197]: Total ignored packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7198]: Run expire on '/usr/local/nagiosna/var/KIDC-VMware-RTS/flows'
Apr 30 13:40:10 nagiosna nfcapd[7198]: Limits: Filesize <none>, Lifetime 172800 = 2.0 days, Watermark: 95%
Apr 30 13:40:10 nagiosna nfcapd[7198]: Current size: 1286144 = 1.2 MB, Current lifetime: 93900 = 1.1 days, Number of files: 314
Apr 30 13:40:10 nagiosna nfcapd[7198]: expire completed - nothing to expire.
Apr 30 13:40:10 nagiosna nfcapd[7198]: laucher child exit 1 childs.
Apr 30 13:40:10 nagiosna nfcapd[7198]: laucher waiting childs done. 0 childs
Apr 30 13:40:10 nagiosna nfcapd[7053]: Ident: '1' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:40:10 nagiosna nfcapd[7053]: Signal launcher
Apr 30 13:40:10 nagiosna nfcapd[7053]: Total ignored packets: 0
Apr 30 13:40:11 nagiosna nfcapd[7054]: Run expire on '/usr/local/nagiosna/var/S3FP01/flows'
Apr 30 13:40:11 nagiosna nfcapd[7054]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:40:11 nagiosna nfcapd[7054]: Current size: 40763392 = 38.9 MB, Current lifetime: 2985300 = 4.9 weeks, Number of files: 9952
Apr 30 13:40:11 nagiosna nfcapd[7054]: expire completed - nothing to expire.
Apr 30 13:40:11 nagiosna nfcapd[7054]: laucher child exit 1 childs.
Apr 30 13:40:11 nagiosna nfcapd[7054]: laucher waiting childs done. 0 childs
Clicking on Network traffic in XI

Code: Select all

Apr 30 13:45:00 nagiosna nfcapd[7133]: Ident: '5' Flows: 2098811, Packets: 177006184, Bytes: 106446785704, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7133]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7133]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7101]: Ident: '4' Flows: 28857, Packets: 3726163, Bytes: 3828671956, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7101]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7101]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7102]: Run expire on '/usr/local/nagiosna/var/S3FP02N/flows'
Apr 30 13:45:00 nagiosna nfcapd[7102]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7102]: Current size: 2302558208 = 2.1 GB, Current lifetime: 2999700 = 5.0 weeks, Number of files: 10000
Apr 30 13:45:00 nagiosna nfcapd[7102]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7102]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7102]: laucher waiting childs done. 0 childs
Apr 30 13:45:00 nagiosna nfcapd[7093]: Ident: '3' Flows: 196389, Packets: 86187379, Bytes: 63168076362, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7093]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7093]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7165]: Ident: '7' Flows: 8211, Packets: 27040319, Bytes: 38577369306, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7165]: Signal launcher
Apr 30 13:45:00 nagiosna nfcapd[7165]: Total ignored packets: 0
Apr 30 13:45:00 nagiosna nfcapd[7134]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-A01/flows'
Apr 30 13:45:00 nagiosna nfcapd[7134]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7134]: Current size: 5115060224 = 4.8 GB, Current lifetime: 764100 = 1.3 weeks, Number of files: 2548
Apr 30 13:45:00 nagiosna nfcapd[7134]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7134]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7134]: laucher waiting childs done. 0 childs
Apr 30 13:45:00 nagiosna nfcapd[7166]: Run expire on '/usr/local/nagiosna/var/ZGW-INT-B01/flows'
Apr 30 13:45:00 nagiosna nfcapd[7166]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:00 nagiosna nfcapd[7166]: Current size: 30797824 = 29.4 MB, Current lifetime: 761700 = 1.3 weeks, Number of files: 2540
Apr 30 13:45:00 nagiosna nfcapd[7166]: expire completed - nothing to expire.
Apr 30 13:45:00 nagiosna nfcapd[7166]: laucher child exit 1 childs.
Apr 30 13:45:00 nagiosna nfcapd[7166]: laucher waiting childs done. 0 childs
Apr 30 13:45:01 nagiosna nfcapd[7094]: Run expire on '/usr/local/nagiosna/var/S3FP01N/flows'
Apr 30 13:45:01 nagiosna nfcapd[7094]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:01 nagiosna nfcapd[7094]: Current size: 19893346304 = 18.5 GB, Current lifetime: 2999700 = 5.0 weeks, Number of files: 10000
Apr 30 13:45:01 nagiosna nfcapd[7094]: expire completed - nothing to expire.
Apr 30 13:45:01 nagiosna nfcapd[7094]: laucher child exit 1 childs.
Apr 30 13:45:01 nagiosna nfcapd[7094]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7085]: Ident: '2' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7085]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7085]: Total ignored packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7086]: Run expire on '/usr/local/nagiosna/var/S3FP02/flows'
Apr 30 13:45:10 nagiosna nfcapd[7086]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:10 nagiosna nfcapd[7086]: Current size: 40767488 = 38.9 MB, Current lifetime: 2985600 = 4.9 weeks, Number of files: 9953
Apr 30 13:45:10 nagiosna nfcapd[7086]: expire completed - nothing to expire.
Apr 30 13:45:10 nagiosna nfcapd[7086]: laucher child exit 1 childs.
Apr 30 13:45:10 nagiosna nfcapd[7086]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7197]: Ident: '8' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7197]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7197]: Total ignored packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7198]: Run expire on '/usr/local/nagiosna/var/KIDC-VMware-RTS/flows'
Apr 30 13:45:10 nagiosna nfcapd[7198]: Limits: Filesize <none>, Lifetime 172800 = 2.0 days, Watermark: 95%
Apr 30 13:45:10 nagiosna nfcapd[7198]: Current size: 1290240 = 1.2 MB, Current lifetime: 94200 = 1.1 days, Number of files: 315
Apr 30 13:45:10 nagiosna nfcapd[7198]: expire completed - nothing to expire.
Apr 30 13:45:10 nagiosna nfcapd[7198]: laucher child exit 1 childs.
Apr 30 13:45:10 nagiosna nfcapd[7198]: laucher waiting childs done. 0 childs
Apr 30 13:45:10 nagiosna nfcapd[7053]: Ident: '1' Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
Apr 30 13:45:10 nagiosna nfcapd[7053]: Signal launcher
Apr 30 13:45:10 nagiosna nfcapd[7053]: Total ignored packets: 0
Apr 30 13:45:11 nagiosna nfcapd[7054]: Run expire on '/usr/local/nagiosna/var/S3FP01/flows'
Apr 30 13:45:11 nagiosna nfcapd[7054]: Limits: Filesize <none>, Lifetime 3024000 = 5.0 weeks, Watermark: 95%
Apr 30 13:45:11 nagiosna nfcapd[7054]: Current size: 40767488 = 38.9 MB, Current lifetime: 2985600 = 4.9 weeks, Number of files: 9953
Apr 30 13:45:11 nagiosna nfcapd[7054]: expire completed - nothing to expire.
Apr 30 13:45:11 nagiosna nfcapd[7054]: laucher child exit 1 childs.
Apr 30 13:45:11 nagiosna nfcapd[7054]: laucher waiting childs done. 0 childs
I can bump the NA server to 8GB of RAM to test. We have only 4 sources total collecting data, which doesn't seem like a lot, but the slowness could also be due to the volume, I guess. Even navigating the pages within NA is painfully slow. I do want to mention that this server does not have any access to the internet. I wonder if there is anything that's trying to talk with the outside world? (just a wild stab)
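Since volume is the open question, a small sketch like the following can boil the expire entries in the log excerpts above down to one line per source. It assumes the syslog layout shown in this thread (the process tag such as `nfcapd[7166]:` in the fifth field, the flow path as the last field of the "Run expire on" line); the function name `source_sizes` is made up for illustration.

```shell
# Hypothetical helper: summarize nfcapd expire entries as
#   <source> <bytes on disk> <number of files>
# Reads syslog text on stdin.
source_sizes() {
  awk '
    /Run expire on/ {
      path = $NF
      gsub("\047", "", path)        # strip the single quotes around the path
      n = split(path, parts, "/")
      name[$5] = parts[n - 1]       # directory just above "flows"
    }
    /Current size:/ && ($5 in name) {
      # $8 is the size in bytes, the last field is the file count
      printf "%s %s %s\n", name[$5], $8, $NF
    }
  '
}

# Usage, largest source last:
#   source_sizes < /var/log/messages | sort -k2 -n | tail
```

Matching the two lines by the process tag keeps the pairing correct even when entries from several nfcapd processes are interleaved, as they are in the excerpts above.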
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

Added the RAM and will keep an eye on it. So far it is not fully consumed, but the behavior is the same.
added RAM.JPG
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Network Analyzer Slow

Post by jolson »

CFT6Server wrote: "I do want to mention that this server does not have any access to the internet. I wonder if there is anything that's trying to talk with the outside world? (just a wild stab)"
I think that Nagios NA is set up to resolve hostnames by default. Does the Web GUI get any 'snappier' when you turn DNS resolution off?
2015-04-30 16_40_44-Global Default Settings • Nagios Network Analyzer.png
When you log in, try not clicking the 'Sources' tab. Are the other tabs all quick to respond? If you click Sources, do the other tabs slow down?

My guess at this point is DNS resolution. Let me know if that is the case.
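One rough way to test the DNS theory from the shell is to time a single reverse lookup on the NA server. The sketch below is only an illustration: `elapsed_seconds` is a made-up helper, `getent hosts` is the standard glibc lookup tool, and the address in the usage line is a documentation placeholder that should be replaced with an IP that actually appears in your flow data.

```shell
# Hypothetical helper: run any command and print how many whole seconds it took.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true    # the timing matters here, not the exit status
  end=$(date +%s)
  echo $(( end - start ))
}

# Usage (placeholder address):
#   elapsed_seconds getent hosts 192.0.2.10
```

A lookup that takes several seconds before giving up would point to resolver timeouts, which add up quickly when a report tries to resolve many addresses.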
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

Looks like ours is not checked. However, I just recently upgraded NA, since we had the old 1.9 OVA deployed. It looks like after the upgrade the options need to be "reset": the options in global settings were blank. I enabled DNS resolution and then disabled it, then restarted the server.

Second observation: I clicked around the interface, and as long as I don't click on any sources, the interface is snappy. As soon as I click on a source, it is pretty slow and I cannot get anywhere until that request is finished. Even trying to open another tab will pause until the source page is fully loaded. (I am seeing that nfdump is running during the wait.)

Lastly, I clicked into a smaller sample (which is about 33MB) and it is pretty snappy. However, switching over to the other source, which is the core switch for the network, it does take a long time to load. Currently this source is sitting at 5.3GB. In your environments, what are your data sample sizes? When do you start to see performance degradation? Perhaps I just need to tune up the resources even more? However, I am not seeing nfdump using more than 1 core.

I tested the response on a 2.2GB data source and it seems to be responding OK.

After a couple of clicks and tests, the memory usage has consumed most of the 8GB I've given it.
top.JPG

Code: Select all

# free -m
             total       used       free     shared    buffers     cached
Mem:          8001       7079        921          0         25       6627
-/+ buffers/cache:        426       7575
Swap:          255          0        255
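A note on reading that output: in this older `free` layout, the "used" figure on the `Mem:` line includes buffers and page cache, which the kernel hands back when applications need memory. The sketch below subtracts them to estimate what applications are really holding. The helper name `app_used_mb` is made up, and it assumes the column order shown above (total, used, free, shared, buffers, cached); newer versions of `free` use different columns, so it does not apply there.

```shell
# Hypothetical helper: estimate application memory (MB) from old-style `free -m` output.
# Column order assumed: Mem: total used free shared buffers cached
app_used_mb() {
  awk '/^Mem:/ { print $3 - $6 - $7 }'
}

# The Mem: line from the output above
printf '%s\n' 'Mem:          8001       7079        921          0         25       6627' | app_used_mb
```

The result lands within a megabyte of the "-/+ buffers/cache" line, so most of the 8GB is file cache from reading flow data rather than memory the processes are holding on to.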
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

There's also an interesting message I get...

When looking at "last week":
This report runs on raw data. Your current start time (-1 week) is longer than your raw data lifetime (-5w). You will only be seeing the last 5w of data.
When looking at "last 24 hours":
This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-5w).
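For reference, the arithmetic behind those messages can be checked by hand. This is a minimal sketch with made-up helper names (`to_seconds`, `exceeds_lifetime`); the 5-week lifetime is the 3024000-second value from the nfcapd log earlier in the thread.

```shell
# Hypothetical helper: convert "<count> <unit>" to seconds.
to_seconds() {
  case "$2" in
    hour|hours|h) echo $(( $1 * 3600 )) ;;
    day|days|d)   echo $(( $1 * 86400 )) ;;
    week|weeks|w) echo $(( $1 * 604800 )) ;;
    *) echo "unknown unit: $2" >&2; return 1 ;;
  esac
}

# Succeeds only if the report window ($1) reaches back further than the lifetime ($2).
exceeds_lifetime() {
  [ "$1" -gt "$2" ]
}

lifetime=$(to_seconds 5 weeks)    # 3024000, matching "Lifetime 3024000 = 5.0 weeks"
start=$(to_seconds 24 hours)
if exceeds_lifetime "$start" "$lifetime"; then
  echo "start time is older than the raw data lifetime"
else
  echo "start time is within the raw data lifetime"
fi
```

Both the 24-hour and the 1-week windows come out shorter than the 5-week lifetime, so the warning does not seem to match the numbers; whether that is a display issue or a comparison problem in NA is not something this thread confirms.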