Network Analyzer Slow

This support forum board is for support questions relating to Nagios Network Analyzer, our network traffic and bandwidth analysis solution.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Network Analyzer Slow

Post by Box293 »

Can you please run this command and post back the output:

Code: Select all

du -h /usr/local/nagiosna/var/
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Re: Network Analyzer Slow

Post by CFT6Server »

Here it is

Code: Select all

# du -h /usr/local/nagiosna/var/
11G     /usr/local/nagiosna/var/INT-A01/flows
11G     /usr/local/nagiosna/var/INT-A01
39M     /usr/local/nagiosna/var/S3FP01/flows
40M     /usr/local/nagiosna/var/S3FP01
39M     /usr/local/nagiosna/var/S3FP02/flows
40M     /usr/local/nagiosna/var/S3FP02
19G     /usr/local/nagiosna/var/S3FP01N/flows
2.7M    /usr/local/nagiosna/var/S3FP01N/views/test
3.9M    /usr/local/nagiosna/var/S3FP01N/views
19G     /usr/local/nagiosna/var/S3FP01N
60M     /usr/local/nagiosna/var/INT-B01/flows
62M     /usr/local/nagiosna/var/INT-B01
2.2M    /usr/local/nagiosna/var/VMware-RTS/flows
3.4M    /usr/local/nagiosna/var/VMware-RTS
2.1G    /usr/local/nagiosna/var/S3FP02N/flows
2.1G    /usr/local/nagiosna/var/S3FP02N
32G     /usr/local/nagiosna/var/
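
To make the largest sources easier to spot, the listing above can be limited to one directory level and sorted by size. This is only a sketch: the `largest_sources` function and the `NNA_VAR` variable are names made up for illustration, and it assumes a `du` that accepts `-d` and a `sort` that accepts `-h` (GNU coreutils or a recent BSD).

```shell
# Path taken from this thread; override NNA_VAR to point somewhere else.
NNA_VAR="${NNA_VAR:-/usr/local/nagiosna/var}"

# Hypothetical helper: print per-source disk usage, largest first.
largest_sources() {
    # -d 1 limits du to one directory level below the given path;
    # sort -rh orders human-readable sizes (K, M, G) in descending order.
    du -h -d 1 "$1" 2>/dev/null | sort -rh
}

# Only run against the default path if it exists on this machine.
if [ -d "$NNA_VAR" ]; then
    largest_sources "$NNA_VAR"
fi
```

With the listing above, the `S3FP01N` and `INT-A01` sources would sort to the top, just below the total for the whole directory.
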
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Network Analyzer Slow

Post by jolson »

I spoke with a developer, and the only performance gain that we came up with would be switching the server to SSDs if it is not using them already. The delay is due to the fact that the files NNA needs to parse are gigabytes in size. You can definitely use 'Views' to narrow down the scope to the ports/IPs you're interested in - I suggest toying with Views if you have not already.
2015-05-01 11_52_27-Views • Nagios Network Analyzer.png
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.

Re: Network Analyzer Slow

Post by CFT6Server »

Unfortunately these are VMs so they are on a shared data store. I can look into getting these into a faster data store to see if it improves performance. I am interested to know the amount of flow data Nagios is getting in their test environments. Are we at a high level? This is only from a couple core switches, so the sources and data will only grow from here.

The XI integration looks back 24 hours by default, so I am not sure that modifying the views will be much help on that front. Looking at the host from XI and clicking on network traffic analysis takes too long, and that takes away from the benefit of having NNA integrated.

Not sure if this is possible, but since Nagios has the Log Server solution, which is based on Elasticsearch, could NNA potentially use an Elasticsearch backend to store and query this netflow data? This would improve the performance drastically. Of course this is a suggestion made without really knowing how NNA works under the hood, and I could be off into space on this.

Re: Network Analyzer Slow

Post by jolson »

We do not have a test environment at the scale of your production environment - but the performance degradation is certainly happening due to the large files we're seeing in /usr/local/nagiosna/var. These are flat files, and they need to be analyzed when Nagios Network Analyzer picks up information to display in the Web GUI.

You are correct that an Elasticsearch or similar backend would speed up the processing - or something that pre-caches the netflow data.

The only reasonable solution that I can come up with is to move to a faster storage medium so that the large flow files can be loaded more quickly - in the meantime I would be happy to put in a feature request for some sort of pre-caching backend.

Re: Network Analyzer Slow

Post by CFT6Server »

OK, thanks. In this case, I will have to play around with the configuration and with the size and retention of the sources.
Please submit a feature request for a pre-caching backend. (It could be a good plug or value-add for Log Server customers if there's some sort of tie-in.)

Also, I am not sure if this can be addressed as well: if NNA is stuck processing, you cannot even navigate to another page. Is there a way to code it so that if the user navigates away, it kills the process and goes to the new page?

Re: Network Analyzer Slow

Post by jolson »

I have submitted two feature requests for you - one for the pre-caching backend, and another regarding increasing performance/responsiveness when navigating to other pages after a large source is selected.

I spoke with a developer about the latter feature and it looks like it is possible, but would require a lot of re-writing so it cannot be done quickly. It is something that we're aware of and working toward, for both XI and NNA.

Anything else I could help you with?

Re: Network Analyzer Slow

Post by CFT6Server »

Thanks. I am trying to get some usefulness out of NNA by controlling our sample size. So I've limited the source to 2 days for testing and it is at about 6.6GB worth of data.

1. I've built some queries, but they are taking a long time (even if I specify only the last 2 hours).
2. I see this message when running queries, and I'm not sure why it is reporting this:
"This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-2d). You will only be seeing the last 2d of data."
Could that have something to do with the slow queries, if it is still trying to read the whole data set?

I tested the queries using smaller sources, up to about 3GB, and they run OK and pretty fast, even at 24 hours. The 6GB source just doesn't finish...
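
For sizing retention, the figures in this post (about 6.6GB for 2 days, with queries still completing at roughly 3GB) allow a rough back-of-the-envelope estimate. This is a sketch only: `hours_within_budget` is a made-up helper, it assumes flow volume is roughly constant over time, and the 3GB figure is just what was observed in this thread, not a documented limit.

```shell
# Hypothetical helper: estimate how many hours of raw data fit under a size budget.
# $1 = observed size in GB, $2 = number of days that size covers, $3 = budget in GB
hours_within_budget() {
    awk -v size="$1" -v days="$2" -v budget="$3" \
        'BEGIN {
            per_hour = size / (days * 24)      # GB written per hour, assumed constant
            printf "%.1f\n", budget / per_hour # hours of data that fit in the budget
        }'
}

# 6.6GB over 2 days, with a 3GB budget
hours_within_budget 6.6 2 3   # prints 21.8
```

Under those assumptions, a retention of a little under one day would keep this source near the size that still returned results.
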

Re: Network Analyzer Slow

Post by jolson »

"This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-2d). You will only be seeing the last 2d of data."

This is a feature that was added in version 2R1.0:
- Added warning text on queries/reports when the begin date is longer than the raw data lifetime -JO
http://assets.nagios.com/downloads/nagi ... HANGES.TXT
According to the developers, there may be a bug in the way that this data is calculated. Does that error seem off to you? (Do you have 2d of raw data?)

"I've built some queries, but they are taking a long time (even if I specify only the last 2 hours)"

How long are these queries taking? My assumption is that it should take about twice as long to run as your 3GB reports - does it take much longer?

Re: Network Analyzer Slow

Post by CFT6Server »

"This is a feature that was added in version 2R1.0."

In terms of the error message, I believe the calculation is off. I am asking for 24 hours of data, but it says that this is longer than my 2 days of data. My data does cover 2 days, so why is it saying 24 hours is longer than 2 days? Is it really trying to look at the entire data set, or is it just the calculation and message that are off?
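
To illustrate why the warning looks wrong for these values, here is a minimal sketch that converts both offsets to seconds before comparing them. This is not NNA's code: `offset_to_seconds` is a hypothetical helper, and the unit spellings it accepts are only guessed from the text of the warning.

```shell
# Hypothetical helper: convert a relative offset such as "-24 hours" or "-2d" to seconds.
offset_to_seconds() {
    spec=$(printf '%s' "$1" | tr -d ' -')   # "-24 hours" -> "24hours"
    num=${spec%%[!0-9]*}                    # leading digits, e.g. "24"
    unit=${spec#"$num"}                     # remaining unit, e.g. "hours"
    [ -n "$num" ] || { echo "no number in: $1" >&2; return 1; }
    case "$unit" in
        h|hour|hours) echo $((num * 3600)) ;;
        d|day|days)   echo $((num * 86400)) ;;
        *) echo "unknown unit: $unit" >&2; return 1 ;;
    esac
}

start=$(offset_to_seconds "-24 hours")   # requested look-back: 86400 seconds
lifetime=$(offset_to_seconds "-2d")      # raw data lifetime: 172800 seconds

# Warn only when the look-back really exceeds the raw data lifetime.
if [ "$start" -gt "$lifetime" ]; then
    echo "warn: start time exceeds raw data lifetime"
else
    echo "ok: start time is within raw data lifetime"
fi
```

With the values from the message this prints the "ok" line, since 24 hours is half of 2 days, which is consistent with the suspicion that the comparison behind the warning is off.
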
"How long are these queries taking? My assumption is that it should take about twice as long to run as your 3GB reports - does it take much longer?"

Beyond the 3GB data queries, the 6GB data queries just never finish, not even after twice as long. I started testing my queries in 1-hour increments, looking back only 1 hour, 2 hours, 3 hours. The first hour works fine, but with anything beyond that it just keeps spinning and doesn't return the data. The Chord Diagram completes, but that's it.

A couple more test scenarios for the 6GB data set (also to give you an idea of the size of the data):
Running query: port 1433 aggregated by srcip,dstip - doesn't work past 3 hours (3 hours produces 1316 pages of results)
Running query: port 1433 aggregated by srcip - works for 24 hours (produces 35 pages of results)
Running query: port 1433 aggregated by srcip,dstip,srcport,dstport - only works for 1 hour (1 hour produces 900 pages of results)

I don't think these are overly complicated queries. We have built some others that look at multiple ports, which increases the query times.

This cannot handle flow data from a single core switch, and unfortunately NNA is not going to be useful if we cannot keep at least 24 to 48 hours of data and query against it. I am not sure what to do at this point. This hinders the XI integration as well, since it asks for src or dst ip for all source groups within NNA.