Very Sluggish Web Interface

NCATmax · Post by **NCATmax** » Wed Sep 23, 2020 11:38 am

Hello,

I am not even sure where to start on this issue. I would be glad to provide any additional information that may be useful.

The biggest visible issue is that the whole web interface is very slow. It takes over 60 seconds to log in. It takes another 20-30 seconds to pull up the Dashboards page, and it takes even longer for data to actually show up in the graphs on the default dashboard. Any searches or queries are also extremely slow.

It also appears that as of a few hours ago, NLS is no longer recording any data.

There was an event a few weeks back where this server ran out of disk space. I am not sure if that is related, but I did want to mention that.

I see many errors while looking through the elasticsearch logs. I think some of them are related to some bad filters putting the wrong type of data into certain fields, but there are also some others that I do not recognize.

I have attached the elasticsearch log from today that contains many errors.

Thank you for any insight you may be able to provide,
Max Farrior

scottwilkerson · Post by **scottwilkerson** » Wed Sep 23, 2020 11:45 am

The log you posted is reporting many out of memory errors.

How much memory do you have allocated to this server? You may need to increase it.
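For anyone checking a log of their own, here is a minimal sketch that counts the lines mentioning `OutOfMemoryError`, the Java exception name that appears when the Elasticsearch heap is exhausted. The function name and the sample lines are illustrative assumptions; they are not taken from the attached log.

```python
from typing import Iterable


def count_oom_lines(lines: Iterable[str]) -> int:
    """Count log lines that mention a Java OutOfMemoryError."""
    return sum(1 for line in lines if "OutOfMemoryError" in line)


# Hypothetical sample lines, invented for illustration only.
sample = [
    "some unrelated log line",
    "java.lang.OutOfMemoryError: Java heap space",
    "another unrelated log line",
    "Caused by: java.lang.OutOfMemoryError: Java heap space",
]
print(count_oom_lines(sample))
```

An open file handle can be passed in place of `sample`, since the function only iterates over lines.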

NCATmax · Post by **NCATmax** » Wed Sep 23, 2020 12:01 pm

Thank you for the response.

The VM currently has 16 GB of memory.

I looked at the VM's memory usage in Nagios XI. Memory usage has been steadily increasing since around 8:30 am this morning. I believe a colleague was using NLS at that time; he was the one who discovered these issues. Is the steady increase normal? I would think it should stop increasing once the query finishes. (Maybe the query didn't finish because of memory issues?)

I will make a request for the additional memory. However, we are currently running thin on memory; my request for 16 GB received some resistance (the VM previously had 8 GB).

Is there a way to reduce memory usage? Does having complex Logstash filters have a significant effect on memory usage?

scottwilkerson · Post by **scottwilkerson** » Wed Sep 23, 2020 12:12 pm

NCATmax wrote:(Maybe the query didn't finish because of memory issues?)

This is very likely, especially if they were running a search spanning a large amount of data.

NCATmax wrote:Is there a way to reduce memory usage?

The only real way to reduce this would be to close any indexes that aren't necessary for your searches, or to increase the number of instances in your cluster (although this will require a larger license).
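As a rough illustration of the "close unneeded indexes" option, the sketch below picks out which daily indices fall outside a retention window. It assumes the indices are named like `logstash-YYYY.MM.DD`; that naming pattern, the function name, and the sample names are assumptions for the example. Each name it returns could then be closed, for instance through Elasticsearch's close index API (`POST /<index>/_close`) or through the product's own maintenance settings.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List

# Assumption: daily indices named like "logstash-2020.09.23".
INDEX_PREFIX = "logstash-"
DATE_FORMAT = "%Y.%m.%d"


def indices_to_close(names: Iterable[str], today: date, keep_days: int) -> List[str]:
    """Return daily indices older than the newest `keep_days` days."""
    cutoff = today - timedelta(days=keep_days - 1)  # oldest day to keep open
    stale = []
    for name in names:
        if not name.startswith(INDEX_PREFIX):
            continue  # skip indices that do not follow the assumed pattern
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # skip names whose suffix is not a date
        if day < cutoff:
            stale.append(name)
    return sorted(stale)


# Hypothetical index names for illustration.
names = [
    "logstash-2020.09.23",
    "logstash-2020.09.22",
    "logstash-2020.09.09",
    "some-other-index",
]
print(indices_to_close(names, date(2020, 9, 23), keep_days=14))
```

The function only decides which names are stale; it deliberately makes no network calls, so it can be checked safely before anything is closed.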

NCATmax wrote:Does having complex Logstash filters have a significant effect on memory usage?

This does have an impact, but not a very significant one. The bigger impact comes from the amount of data you have in open indexes and the complexity of the queries you are running through the UI.

NCATmax · Post by **NCATmax** » Wed Sep 23, 2020 12:26 pm

I do believe my colleague was searching over the last 30 days.

Let's assume I am limited to 16 GB of memory. Is there any best practice advice about how to minimize memory issues?

For example, NLS is currently configured to keep 30 days of indices open. (Each index is approximately 20-25 GB.) Would it make sense to reduce this to say 14 days?

A few of the filters that are being used create many new fields. Does having many new fields increase index size? And does it require more memory to search these larger indices? (I am wondering if I should use simpler filters so we can keep more indices open for searching.)

scottwilkerson · Post by **scottwilkerson** » Wed Sep 23, 2020 12:49 pm

NCATmax wrote:Would it make sense to reduce this to say 14 days?

This would help a lot.
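To put rough numbers on it, here is a small sketch using the figures from the question (one index per day, 20-25 GB each). The function name is made up for the example, and on-disk size is only a proxy: it is not the same thing as heap usage, but it shows how much less open data a search has to cover.

```python
def open_data_gb(open_days: int, gb_per_index: float) -> float:
    """Total size of open daily indices, assuming one index per day."""
    return open_days * gb_per_index


for days in (30, 14):
    low = open_data_gb(days, 20)
    high = open_data_gb(days, 25)
    print(f"{days} days open: {low:.0f}-{high:.0f} GB")
```

With these inputs, 30 days comes to 600-750 GB of open index data and 14 days comes to 280-350 GB, so the change cuts the open data by a little more than half.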

NCATmax wrote:A few of the filters that are being used create many new fields. Does having many new fields increase index size? And does it require more memory to search these larger indices? (I am wondering if I should use simpler filters so we can keep more indices open for searching.)

Yes and Yes.

NCATmax · Post by **NCATmax** » Wed Sep 23, 2020 3:05 pm

I see. In the short term, I will reduce the number of open indices to 14. And in the medium term, I'll look at reducing the number of additional fields the filters create, as well as trying to get some more memory added.

Speaking generally, is 14 days of data still useful or within a typical range? I know this completely depends on how NLS is used, but I remember thinking that 30 days sounded a little small. I have no point of reference.

Is the lack of memory the cause of the issue in my first post? Would simply restarting the NLS services address the issue? (Logstash, Elasticsearch, Apache)

scottwilkerson · Post by **scottwilkerson** » Wed Sep 23, 2020 4:18 pm

NCATmax wrote:Speaking generally, is 14 days of data still useful or within a typical range? I know this completely depends on how NLS is used, but I remember thinking that 30 days sounded a little small. I have no point of reference.

As you mentioned, it all depends on what your use case is. We have users that only keep one day, and we have others that need ready access to a year's worth, although the latter usually have a large number of instances in their cluster, with 64 GB of RAM in each instance and fast disks.

NCATmax wrote:Is the lack of memory the cause of the issue in my first post? Would simply restarting the NLS services address the issue? (Logstash, Elasticsearch, Apache)

Yes, that was what the error was showing in the log, and yes, that is what I would recommend to resolve it.

NCATmax · Post by **NCATmax** » Tue Sep 29, 2020 6:09 pm

I just wanted to follow up on this.

Restarting all of the NLS services fixed the issue.

To limit memory consumption, I have configured NLS to only keep 14 indices open at once. As a secondary measure, I am going to revisit the filters I had created and try to remove any new fields that aren't really necessary.

Thank you for your help in resolving this.

scottwilkerson · Post by **scottwilkerson** » Wed Sep 30, 2020 7:58 am

Glad to hear it is resolved

Locking thread

Nagios Support Forum
