
Re: Network Analyzer Slow

Posted: Fri May 01, 2015 11:16 am
by Box293
Can you please run this command and post back the output:

Code:

du -h /usr/local/nagiosna/var/

Re: Network Analyzer Slow

Posted: Fri May 01, 2015 11:39 am
by CFT6Server
Here it is

Code:

# du -h /usr/local/nagiosna/var/
11G     /usr/local/nagiosna/var/INT-A01/flows
11G     /usr/local/nagiosna/var/INT-A01
39M     /usr/local/nagiosna/var/S3FP01/flows
40M     /usr/local/nagiosna/var/S3FP01
39M     /usr/local/nagiosna/var/S3FP02/flows
40M     /usr/local/nagiosna/var/S3FP02
19G     /usr/local/nagiosna/var/S3FP01N/flows
2.7M    /usr/local/nagiosna/var/S3FP01N/views/test
3.9M    /usr/local/nagiosna/var/S3FP01N/views
19G     /usr/local/nagiosna/var/S3FP01N
60M     /usr/local/nagiosna/var/INT-B01/flows
62M     /usr/local/nagiosna/var/INT-B01
2.2M    /usr/local/nagiosna/var/VMware-RTS/flows
3.4M    /usr/local/nagiosna/var/VMware-RTS
2.1G    /usr/local/nagiosna/var/S3FP02N/flows
2.1G    /usr/local/nagiosna/var/S3FP02N
32G     /usr/local/nagiosna/var/
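
The full listing repeats every subdirectory, which makes the largest sources easy to miss. A minimal sketch of a per-source summary follows, assuming GNU du and sort (for `sort -h`). `source_sizes` is a hypothetical helper name, not part of NNA, and the default path is the one used in this thread.

```shell
# Hypothetical helper (not part of NNA): print one summarized size per
# source directory under a base path, smallest first.
source_sizes() {
    base="${1:-/usr/local/nagiosna/var}"
    # du -s prints a single total per argument; -h uses K/M/G suffixes.
    # sort -h (GNU coreutils) orders those human-readable sizes correctly.
    du -sh "$base"/*/ 2>/dev/null | sort -h
}
```

Running `source_sizes /usr/local/nagiosna/var` would list each source once, with the largest at the bottom.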

Re: Network Analyzer Slow

Posted: Fri May 01, 2015 11:52 am
by jolson
I spoke with a developer, and the only performance gain that we came up with would be switching the server to SSDs if it's not using them already - the delay is due to the fact that the files NNA needs to parse are gigabytes in size. You can definitely use 'Views' to narrow down the scope to the ports/IPs you're interested in - I suggest experimenting with Views if you have not already.
[Attached screenshot: 2015-05-01 11_52_27-Views • Nagios Network Analyzer.png]

Re: Network Analyzer Slow

Posted: Fri May 01, 2015 12:30 pm
by CFT6Server
Unfortunately these are VMs so they are on a shared data store. I can look into getting these into a faster data store to see if it improves performance. I am interested to know the amount of flow data Nagios is getting in their test environments. Are we at a high level? This is only from a couple core switches, so the sources and data will only grow from here.

The XI integration looks at the last 24 hours by default, so I am not sure that modifying the views will help much on that front. Looking at the host from XI and clicking on network traffic analysis takes too long, which takes away from the benefit of having NNA integrated.

Not sure if this is possible, but since Nagios has the Log Server solution, which is based on Elasticsearch, could NNA potentially use an Elasticsearch backend to store and query this netflow data? I would expect this to improve performance drastically. Of course, this is a suggestion made without really knowing how NNA works under the hood, and I could be way off on this.

Re: Network Analyzer Slow

Posted: Fri May 01, 2015 1:59 pm
by jolson
We do not have a test environment at the scale of your production environment - but the performance degradation is certainly happening due to the large files we're seeing in /usr/local/nagiosna/var. These are flat files, and they need to be analyzed when Nagios Network Analyzer picks up information to display in the Web GUI.

You are correct that an Elasticsearch or similar backend would speed up the processing - as would something that pre-caches the netflow data.

The only reasonable solution that I can come up with is to move to a faster storage medium so that the large flow files can be loaded more quickly - in the meantime I would be happy to put in a feature request for some sort of pre-caching backend.

Re: Network Analyzer Slow

Posted: Mon May 04, 2015 12:10 pm
by CFT6Server
OK, thanks. In this case, I will have to play around with the configuration and with the size and retention of the sources.
Please submit a feature request for a pre-caching backend. (It could be a good plug or value-add for Log Server customers if there's some sort of tie-in.)

Also, I am not sure if this can be addressed as well: if NNA is stuck processing, you cannot even navigate to another page. Is there a way to code it so that if the user navigates away, it kills the process and goes to the new page?

Re: Network Analyzer Slow

Posted: Mon May 04, 2015 1:57 pm
by jolson
I have submitted two feature requests for you - one for the pre-caching backend, and another regarding increasing performance/responsiveness when navigating to other pages after a large source is selected.

I spoke with a developer about the latter feature and it looks like it is possible, but would require a lot of re-writing so it cannot be done quickly. It is something that we're aware of and working toward, for both XI and NNA.

Anything else I could help you with?

Re: Network Analyzer Slow

Posted: Fri May 08, 2015 2:50 pm
by CFT6Server
Thanks. I am trying to get some usefulness out of NNA by controlling our sample size. So I've limited the source to 2 days for testing and it is at about 6.6GB worth of data.

1. I've built some queries, but they are taking a long time (even if I specify, say, the last 2 hours)
2. I see this message when running queries, and not sure why it is reporting this...
This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-2d). You will only be seeing the last 2d of data.
Could that have something to do with the slow queries, if it is still trying to read the whole data set?

I tested the queries using smaller sources, up to about 3GB, and they run OK and pretty fast, even at 24 hours. The 6GB source just doesn't finish...

Re: Network Analyzer Slow

Posted: Mon May 11, 2015 10:23 am
by jolson
Quote: "This report runs on raw data. Your current start time (-24 hours) is longer than your raw data lifetime (-2d). You will only be seeing the last 2d of data."
This is a feature that was added in version 2R1.0.
- Added warning text on queries/reports when the begin date is longer than the raw data lifetime -JO
http://assets.nagios.com/downloads/nagi ... HANGES.TXT
According to the developers, there may be a bug in the way that this is calculated. Does that warning seem off to you? (Do you have 2d of raw data?)
Quote: "I've built some queries, but they are taking a long time (even if I specify, say, the last 2 hours)"
How long are these queries taking? I would assume that they should take about twice as long to run as your 3GB reports - do they take much longer?

Re: Network Analyzer Slow

Posted: Mon May 11, 2015 11:32 am
by CFT6Server
Quote: "This is a feature that was added in version 2R1.0."
In terms of the warning message, I believe the calculation is off. I am asking for 24 hours of data, but it is saying that this is longer than my 2 days of data. My data does cover 2 days, so why is it saying 24 hours is longer than 2 days? Is it really trying to look at the entire data set, or is it just the calculation and message that are off?
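
To make the expected behaviour concrete, here is a minimal sketch of the comparison the warning describes, assuming both values are simple spans such as `24h` or `2d`. How NNA actually computes this is not shown in this thread; `to_seconds` and `exceeds_lifetime` are hypothetical helper names.

```shell
# Hypothetical helper: convert a span such as "24h" or "2d" to seconds.
to_seconds() {
    n="${1%[hd]}"                      # strip the trailing unit letter
    case "$1" in
        *h) echo $((n * 3600)) ;;      # hours
        *d) echo $((n * 86400)) ;;     # days
        *)  return 1 ;;                # unknown unit
    esac
}

# Succeeds only when the requested window is longer than the retention window.
exceeds_lifetime() {
    [ "$(to_seconds "$1")" -gt "$(to_seconds "$2")" ]
}

if exceeds_lifetime 24h 2d; then
    echo "warning expected"
else
    echo "no warning expected"         # 86400 s is less than 172800 s
fi
```

With both spans converted to the same unit, a 24 hour start time is shorter than a 2 day lifetime, so a correct comparison would not warn here.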
Quote: "How long are these queries taking? My assumption tells me that it should take about twice as long to run as your 3GB reports - does it take much longer?"
Beyond the 3GB data queries, the 6GB data queries just never finish, so it is not a matter of taking twice as long. I started testing my queries in 1 hour increments, looking back only 1 hour, 2 hours, 3 hours. The first hour works fine, but with anything beyond that it just keeps spinning and doesn't return the data. The Chord Diagram completes, but that's it.

A couple more test scenarios for the 6GB data set (also to give you an idea of the size of the data):
Running query: port 1433 aggregated by srcip,dstip - doesn't work past 3 hours (3 hours produces 1316 pages of results)
Running query: port 1433 aggregated by srcip - works for 24 hours (produces 35 pages of results)
Running query: port 1433 aggregated by srcip,dstip,srcport,dstport - only works for 1 hour (1 hour produces 900 pages of results)

I don't think these are overly complicated queries. We have built some others that look at multiple ports, which increases the query times.
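
The page counts in the scenarios above grow with the number of aggregation fields, because every extra field splits the results into more distinct groups. A small sketch with made-up flow records (not real data from this environment) illustrates the effect; `count_groups` is a hypothetical helper.

```shell
# Made-up sample flows, one per line: srcip dstip srcport dstport
flows='10.0.0.1 10.0.1.5 50001 1433
10.0.0.1 10.0.1.5 50002 1433
10.0.0.1 10.0.1.6 50003 1433
10.0.0.2 10.0.1.5 50004 1433
10.0.0.2 10.0.1.6 50005 1433
10.0.0.2 10.0.1.6 50006 1433'

# Hypothetical helper: count distinct groups for the given column list.
count_groups() {
    printf '%s\n' "$flows" | cut -d' ' -f"$1" | sort -u | wc -l | tr -d ' '
}

echo "srcip:                       $(count_groups 1)"    # prints 2
echo "srcip,dstip:                 $(count_groups 1,2)"  # prints 4
echo "srcip,dstip,srcport,dstport: $(count_groups 1-4)"  # prints 6
```

Client source ports are usually different for each connection, so aggregating on srcport tends to produce close to one result row per flow.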

This cannot handle flow data from a single core switch, and unfortunately NNA is not going to be useful if we cannot keep at least 24 to 48 hours of data and query against it. I am not sure what to do at this point. This hinders the XI integration as well, since it asks for the src or dst IP across all source groups within NNA.