Very Sluggish Web Interface
Hello,
I am not even sure where to start on this issue. I would be glad to provide any additional information that may be useful.
The biggest visible issue is that the whole web interface is very slow. It takes over 60 seconds to log in. It takes another 20-30 seconds to pull up the Dashboards page, and it takes even longer for data to actually show up in the graphs on the default dashboard. Any searches or queries are also extremely slow.
It also appears that as of a few hours ago, NLS is no longer recording any data.
There was an event a few weeks back where this server ran out of disk space. I am not sure if that is related, but I did want to mention that.
I see many errors while looking through the Elasticsearch logs. I think some of them are related to some bad filters putting the wrong type of data into certain fields, but there are also some others that I do not recognize.
I have attached the Elasticsearch log from today, which contains many errors.
Thank you for any insight you may be able to provide,
Max Farrior
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Very Sluggish Web Interface
The log you posted is reporting many out of memory errors.
How much memory do you have allocated to this server? You may need to increase it.
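One quick way to confirm this from a log is to count the lines that mention an out-of-memory condition. The sketch below is only an illustration: it assumes the errors appear as the standard Java `OutOfMemoryError` string, and the sample lines are made up for the example, not copied from the attached log.

```python
from collections import Counter

# Assumption: Elasticsearch runs on the JVM, so out-of-memory failures
# usually appear in its log as "java.lang.OutOfMemoryError: <reason>".
OOM_MARKER = "OutOfMemoryError"


def count_oom_errors(lines):
    """Count out-of-memory lines, grouped by the reason after the marker."""
    reasons = Counter()
    for line in lines:
        if OOM_MARKER in line:
            # The text after "OutOfMemoryError:" is the JVM's stated reason.
            _, _, reason = line.partition(OOM_MARKER)
            reasons[reason.lstrip(": ").strip() or "unspecified"] += 1
    return reasons


# Hypothetical sample lines, for illustration only.
sample = [
    "[WARN ][index.engine] failed engine [out of memory]",
    "java.lang.OutOfMemoryError: Java heap space",
    "[INFO ][node] started",
    "Caused by: java.lang.OutOfMemoryError: Java heap space",
]
oom_counts = count_oom_errors(sample)
print(oom_counts)
```

To run it against a real log, pass an open file object instead of `sample`.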
Re: Very Sluggish Web Interface
Thank you for the response.
The VM currently has 16 GB of memory.
I looked at the VM's memory usage in Nagios XI. Memory usage has been steadily increasing since around 8:30am this morning. I believe a colleague was using it at that time; he was the one who discovered these issues. Is the steady increase normal? I would think it should stop increasing once the query finishes. (Maybe the query didn't finish because of memory issues?)
I will make a request for additional memory. However, we are currently running thin on memory, and my request for 16GB received some resistance (the VM previously had 8GB).
Is there a way to reduce memory usage? Does having complex Logstash filters have a significant effect on memory usage?
Re: Very Sluggish Web Interface
NCATmax wrote: (Maybe the query didn't finish because of memory issues?)
This is very likely, especially if they were running a search spanning a large amount of data.
NCATmax wrote: Is there a way to reduce memory usage?
The only real way to reduce this would be to close any indexes that aren't necessary for your searches, or to increase the number of instances in your cluster (although this will require a larger license).
NCATmax wrote: Does having complex Logstash filters have a significant effect on memory usage?
This does have an impact, but not a very significant one. The bigger impact comes from the amount of data you have in open indexes and the complexity of the queries you are running through the UI.
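Closing older indexes can be sketched roughly as follows. This is an illustration under stated assumptions, not the NLS-supported procedure: it assumes daily indexes named `logstash-YYYY.MM.DD`, an Elasticsearch node reachable at `http://localhost:9200`, and it relies on the Elasticsearch close index API (`POST /<index>/_close`). The helper names and the dates are made up for the example.

```python
from datetime import date, datetime, timedelta
from urllib.request import Request, urlopen

# Assumptions (not from the thread): index naming and node address.
INDEX_PREFIX = "logstash-"
ES_URL = "http://localhost:9200"


def indices_to_close(index_names, today, keep_days):
    """Return daily indexes that fall outside the newest `keep_days` days."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue  # skip indexes that are not daily log indexes
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # skip names that do not end in a date
        if day <= cutoff:
            stale.append(name)
    return sorted(stale)


def close_index(name):
    """Close one index via the Elasticsearch close index API."""
    request = Request(f"{ES_URL}/{name}/_close", data=b"", method="POST")
    with urlopen(request) as response:
        return response.status


# Example: keep 14 days open, counting back from a hypothetical date.
names = [f"logstash-2017.03.{d:02d}" for d in range(1, 31)]
stale = indices_to_close(names, today=date(2017, 3, 30), keep_days=14)
print(stale)  # close_index() is deliberately not called here
```

The selection logic is kept separate from the HTTP call so the list of indexes can be reviewed before anything is closed.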
Re: Very Sluggish Web Interface
I do believe my colleague was searching over the last 30 days.
Let's assume I am limited to 16 GB of memory. Is there any best practice advice about how to minimize memory issues?
For example, NLS is currently configured to keep 30 days of indices open. (Each index is approximately 20-25 GB.) Would it make sense to reduce this to say 14 days?
A few of the filters that are being used create many new fields. Does having many new fields increase index size? And does it require more memory to search these larger indices? (I am wondering if I should use simpler filters so we can keep more indices open for searching.)
Re: Very Sluggish Web Interface
NCATmax wrote: Would it make sense to reduce this to say 14 days?
This would help a lot.
NCATmax wrote: A few of the filters that are being used create many new fields. Does having many new fields increase index size? And does it require more memory to search these larger indices? (I am wondering if I should use simpler filters so we can keep more indices open for searching.)
Yes and yes.
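To put rough numbers on this, the sketch below uses the figures from the question (20-25 GB per daily index, 16 GB of RAM). It only compares the on-disk size of the open indexes with total RAM; that ratio is not a measure of actual heap usage, just a way to see how much the open data set shrinks.

```python
def open_data_gb(days_open, gb_per_index):
    """Total on-disk size of the open daily indexes, in GB."""
    return days_open * gb_per_index


RAM_GB = 16  # VM memory stated in the thread

for days in (30, 14):
    low = open_data_gb(days, 20)
    high = open_data_gb(days, 25)
    print(f"{days} days open: {low}-{high} GB on disk, "
          f"{low / RAM_GB:.1f}x-{high / RAM_GB:.1f}x of RAM")
```

Going from 30 to 14 open indexes cuts the open data from roughly 600-750 GB to 280-350 GB.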
Re: Very Sluggish Web Interface
I see. In the short term, I will reduce the number of open indices to 14. And in the medium term, I'll look at reducing the number of additional fields the filters create, as well as trying to get some more memory added.
Speaking generally, is 14 days of data still useful or within a typical range? I know this completely depends on how NLS is used, but I remember thinking that 30 days sounded a little small. I have no point of reference.
Is the lack of memory the cause of the issue in my first post? Would simply restarting the NLS services address the issue? (Logstash, Elasticsearch, Apache)
Re: Very Sluggish Web Interface
NCATmax wrote: Speaking generally, is 14 days of data still useful or within a typical range? I know this completely depends on how NLS is used, but I remember thinking that 30 days sounded a little small. I have no point of reference.
As you mentioned, it all depends on your use case. We have users that only keep one day, and we have others that need ready access to a year's worth, although the latter usually have a large number of instances in their cluster, with 64GB RAM in each instance and fast disks.
NCATmax wrote: Is the lack of memory the cause of the issue in my first post? Would simply restarting the NLS services address the issue? (Logstash, Elasticsearch, Apache)
Yes, that was what the error was showing in the log, and yes, that is what I would recommend to resolve it.
Re: Very Sluggish Web Interface
I just wanted to follow up on this.
Restarting all of the NLS services fixed the issue.
To limit memory consumption, I have configured NLS to only keep 14 indices open at once. As a secondary measure, I am going to revisit the filters I had created and try to remove any new fields that aren't really necessary.
Thank you for your help in resolving this.
Re: Very Sluggish Web Interface
NCATmax wrote: I just wanted to follow up on this. Restarting all of the NLS services fixed the issue. To limit memory consumption, I have configured NLS to only keep 14 indices open at once. As a secondary measure, I am going to revisit the filters I had created and try to remove any new fields that aren't really necessary. Thank you for your help in resolving this.
Glad to hear it is resolved.
Locking thread.