Nagios Log Server listening port abruptly halts v2

james.liew
- Posts: 59
- Joined: Wed Feb 22, 2017 1:30 am
Referring back to here: https://support.nagios.com/forum/viewto ... 37&t=43502
The same server has once again halted listening on port 3515
james.liew
Re: Nagios Log Server listening port abruptly halts v2
Log file attached
mcapra
Re: Nagios Log Server listening port abruptly halts v2
Here's something that sticks out:
{:timestamp=>"2017-05-11T02:26:06.903000+0200", :message=>"Got error to send bulk of actions: None of the configured nodes are available: []", :level=>:error}
{:timestamp=>"2017-05-11T02:26:06.922000+0200", :message=>"Failed to flush outgoing items", :outgoing_count=>9, :exception=>org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available etc etc ... }
Can you also share the Elasticsearch logs from this machine? Or from all machines, if there are multiple instances.
Former Nagios employee
https://www.mcapra.com/
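A quick way to triage a NoNodeAvailableException like the one above is to confirm whether anything is listening on the ports involved. Here's a bash-only probe as a sketch; 9300 is the stock Elasticsearch transport port (what Logstash's output plugin connects to by default), and 3515 is the syslog input from this thread. Adjust hosts and ports to your setup:

```shell
# Probe a TCP port using bash's /dev/tcp pseudo-device (no netcat needed).
port_open() {
  if timeout 2 bash -c ">/dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}
port_open localhost 9300   # ES transport port Logstash talks to
port_open localhost 3515   # the syslog listener that keeps halting
```

If 9300 reports closed while the ES process is still running, that points at ES itself rather than the Logstash input.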
cdienger
Re: Nagios Log Server listening port abruptly halts v2
We're also seeing more memory-related issues when it appears ES is trying to do a merge:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:391)
at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:50)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1985)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1979)
at org.elasticsearch.index.engine.InternalEngine.maybeMerge(InternalEngine.java:793)
at org.elasticsearch.index.shard.IndexShard$EngineMerger$1.run(IndexShard.java:1237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
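Worth noting: this particular OutOfMemoryError ("unable to create new native thread") usually points at an OS-level thread/process limit rather than exhausted heap. A quick check of the limits in play (Linux paths; the pgrep pattern is just one guess at how to find the ES process):

```shell
# Max processes/threads the current user may create; ES merge threads
# count against this, and hitting it produces exactly this OOM.
ulimit -u
# System-wide thread ceiling
cat /proc/sys/kernel/threads-max
# Limits of the running Elasticsearch JVM itself (uncomment to use;
# the pgrep pattern is a hypothetical example):
# cat /proc/$(pgrep -f org.elasticsearch)/limits | grep -i processes
```

If `ulimit -u` for the elasticsearch user is low (e.g. 1024), raising it in /etc/security/limits.conf is the usual fix.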
Check the size of the primary indices by running:
curl -XGET http://localhost:9200/_cat/indices?v
and looking at the pri.store.size column. Merging large indices may be too taxing for the system; you can try disabling optimization under Administration > System > Backup & Maintenance by setting "Optimize Indexes older than" to 0.
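To spot the heaviest indices quickly, that output can be sorted by the pri.store.size column. A sketch using made-up sample data (the column layout matches what `_cat/indices?v` prints; the index names and sizes are invented for illustration):

```shell
# Fake _cat/indices?v output; pri.store.size is column 9.
cat > /tmp/indices_sample.txt <<'EOF'
health status index               pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2017.05.09   5   1    1200000            0     18.2gb          9.1gb
green  open   logstash-2017.05.10   5   1    2400000            0     36.6gb         18.3gb
green  open   logstash-2017.05.11   5   1     600000            0      9.0gb          4.5gb
EOF
# Drop the header, sort column 9 human-numerically (-h understands gb/mb),
# largest first. Against a live node, pipe the curl output in instead.
tail -n +2 /tmp/indices_sample.txt | sort -k9 -rh
```

The top line is the index most likely to hurt during a merge.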
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
james.liew
Re: Nagios Log Server listening port abruptly halts v2
Already shared both the elasticsearch and logstash logs above.
mcapra wrote:
Can you also share the Elasticsearch logs from this machine? Or all machines if there are multiple instances. [...]
I will include our 2nd node in this post.
james.liew
Re: Nagios Log Server listening port abruptly halts v2
I took a screencap as attached. Not of everything, but what I could get from the top.
cdienger wrote:
We're also seeing more memory related issues when it appears ES is trying to do a merge: [...]
Check the size of the primary indices by running:
curl -XGET http://localhost:9200/_cat/indices?v
and looking at the pri.store.size column. Merging large indices may be too taxing for the system and you can try disabling optimization under Administration > System > Backup & Maintenance, by setting "Optimize Indexes older than" to 0.
mcapra
Re: Nagios Log Server listening port abruptly halts v2
Have you given this a shot?
cdienger wrote:
Merging large indices may be too taxing for the system and you can try disabling optimization under Administration > System > Backup & Maintenance, by setting "Optimize Indexes older than" to 0.
Given that the last crash occurred during an optimization of indices, I think disabling this might be helpful. If you disable that job and the system is still unstable, could you provide fresh Elasticsearch logs?
Former Nagios employee
https://www.mcapra.com/
james.liew
Re: Nagios Log Server listening port abruptly halts v2
Is it okay to run without the index optimization?
cdienger
Re: Nagios Log Server listening port abruptly halts v2
From what I've gathered, the main benefit of optimization is reducing the amount of time needed during restarts; it has little impact on search performance. It is okay to disable it.
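For reference, the maintenance job is essentially a scheduled index optimize. If you later want to merge a single quiet, older index by hand during an off-hours window, something like the following should work; the endpoint name depends on the bundled ES version (`_optimize` before ES 2.1, renamed `_forcemerge` from 2.1 on), and the index name here is a placeholder:

```shell
# Merging down to one segment is what makes this I/O- and memory-hungry
# on big indices, so run it only on indices no longer being written to.
curl -XPOST 'http://localhost:9200/logstash-2017.05.01/_optimize?max_num_segments=1'
```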
james.liew
Re: Nagios Log Server listening port abruptly halts v2
Changed it 3 days ago; still monitoring as of now.