Nagios Log Server listening port abruptly halts v2

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Nagios Log Server listening port abruptly halts v2

Post by james.liew »

Referring back to here: https://support.nagios.com/forum/viewto ... 37&t=43502

The same server has once again stopped listening on port 3515.
You do not have the required permissions to view the files attached to this post.
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios Log Server listening port abruptly halts v2

Post by james.liew »

Log file attached
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Nagios Log Server listening port abruptly halts v2

Post by mcapra »

Here's something that sticks out:

Code: Select all

{:timestamp=>"2017-05-11T02:26:06.903000+0200", :message=>"Got error to send bulk of actions: None of the configured nodes are available: []", :level=>:error}
{:timestamp=>"2017-05-11T02:26:06.922000+0200", :message=>"Failed to flush outgoing items", :outgoing_count=>9, :exception=>org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available etc etc ... }
Can you also share the Elasticsearch logs from this machine? Or all machines if there are multiple instances.
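That NoNodeAvailableException means Logstash's Elasticsearch output could not reach any node over the transport layer. A quick reachability probe (a sketch; assumes the default ports, 9200 for HTTP and 9300 for transport, and a Linux host with bash and coreutils available):

```shell
# Report whether a TCP port on a host is reachable, using bash's
# built-in /dev/tcp redirection; prints "open" or "closed".
probe_port() {
    if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
    fi
}

probe_port localhost 9200   # HTTP API
probe_port localhost 9300   # transport port, which Logstash connects to
```

If 9300 reports closed while Elasticsearch is supposedly running, the service has likely died or is bound to a different interface.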
Former Nagios employee
https://www.mcapra.com/
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios Log Server listening port abruptly halts v2

Post by cdienger »

We're also seeing more memory-related issues when it appears ES is trying to do a merge:

java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:391)
at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:50)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1985)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1979)
at org.elasticsearch.index.engine.InternalEngine.maybeMerge(InternalEngine.java:793)
at org.elasticsearch.index.shard.IndexShard$EngineMerger$1.run(IndexShard.java:1237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
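In my experience, "unable to create new native thread" usually means the OS per-user process/thread cap was hit rather than the Java heap filling up, so it is also worth comparing that cap against the threads in use (a sketch; assumes a Linux host, and note that ulimit reports the limit for the current shell's user, not necessarily the elasticsearch service user):

```shell
# Compare the per-user process/thread cap against current usage.
# "unable to create new native thread" fires when this cap is hit,
# even if the Java heap still has headroom.
echo "per-user limit (ulimit -u): $(ulimit -u)"
echo "kernel threads-max:         $(cat /proc/sys/kernel/threads-max)"
echo "threads in use:             $(ps -eLf | tail -n +2 | wc -l)"
```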


Check the size of the primary indices by running:

curl -XGET http://localhost:9200/_cat/indices?v

and looking at the pri.store.size column. Merging large indices may be too taxing for the system, so you can try disabling optimization under Administration > System > Backup & Maintenance by setting "Optimize Indexes older than" to 0.
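To spot the biggest offenders quickly, the two relevant columns can be pulled out and sorted; a sketch, assuming the `h` and `bytes` query parameters are supported by your Elasticsearch version and the default HTTP port 9200 on localhost:

```shell
# Sort the (index, pri.store.size) output of _cat/indices largest-first.
# bytes=b makes sizes plain byte counts so a numeric sort works. Feed it:
#   curl -s 'http://localhost:9200/_cat/indices?h=index,pri.store.size&bytes=b' | top_indices
top_indices() {
    sort -k2 -rn | head -n 10
}
```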
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios Log Server listening port abruptly halts v2

Post by james.liew »

mcapra wrote:Here's something that sticks out:

Code: Select all

{:timestamp=>"2017-05-11T02:26:06.903000+0200", :message=>"Got error to send bulk of actions: None of the configured nodes are available: []", :level=>:error}
{:timestamp=>"2017-05-11T02:26:06.922000+0200", :message=>"Failed to flush outgoing items", :outgoing_count=>9, :exception=>org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available etc etc ... }
Can you also share the Elasticsearch logs from this machine? Or all machines if there are multiple instances.
I already shared both the Elasticsearch and Logstash logs above.

I will include the logs from our 2nd node in this post.
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios Log Server listening port abruptly halts v2

Post by james.liew »

cdienger wrote:We're also seeing more memory-related issues when it appears ES is trying to do a merge:

java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:391)
at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:50)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1985)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1979)
at org.elasticsearch.index.engine.InternalEngine.maybeMerge(InternalEngine.java:793)
at org.elasticsearch.index.shard.IndexShard$EngineMerger$1.run(IndexShard.java:1237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)


Check the size of the primary indices by running:

curl -XGET http://localhost:9200/_cat/indices?v

and looking at the pri.store.size column. Merging large indices may be too taxing for the system, so you can try disabling optimization under Administration > System > Backup & Maintenance by setting "Optimize Indexes older than" to 0.
I took a screencap, attached. It doesn't show everything, but it's what I could capture from the top.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Nagios Log Server listening port abruptly halts v2

Post by mcapra »

Have you given this a shot?
cdienger wrote:Merging large indices may be too taxing for the system and you can try disabling optimization under Administration > System > Backup & Maintenance, by setting "Optimize Indexes older than" to 0.
Given that the last crash occurred during an optimization of indices, I think disabling this might be helpful. If you disable that job and the system is still unstable, could you provide fresh Elasticsearch logs?
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios Log Server listening port abruptly halts v2

Post by james.liew »

Is it okay to run without index optimization?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios Log Server listening port abruptly halts v2

Post by cdienger »

From what I've gathered, the main benefit of optimization is reducing the amount of time needed during restarts; it has little impact on search performance, so it is okay to disable it.
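For context, the optimize job force-merges each index's Lucene segments down to a handful, and fewer segments mainly means less work when shards are reopened. If you want to see how fragmented indices actually are after disabling the job, the segment counts can be tallied (a sketch; assumes the `_cat/segments` endpoint is available on your Elasticsearch version and the default port 9200):

```shell
# Tally Lucene segments per index, most fragmented first. Feed it:
#   curl -s 'http://localhost:9200/_cat/segments?h=index,segment' | segments_per_index
segments_per_index() {
    awk '{count[$1]++} END {for (i in count) print i, count[i]}' | sort -k2 -rn
}
```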
james.liew
Posts: 59
Joined: Wed Feb 22, 2017 1:30 am

Re: Nagios Log Server listening port abruptly halts v2

Post by james.liew »

I changed it 3 days ago and am still monitoring as of now.
Locked