Elastic cluster going into a bad state

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.

Post by Jklre »

About three times in the last week, our Nagios Log Server cluster has gone into a bad state. The web interface stops responding, and we need to restart the Elasticsearch service to bring it back to life.

We are running Nagios Log Server 1.4.4.

We have two nodes, each with 4 CPUs and 8 GB of memory, averaging about 600,000 messages per 24 hours.
(Attachments: CPU.png, memory.png)
Here's a snippet of the Elasticsearch logs:

Code:

[2017-09-18 08:58:01,706][DEBUG][action.search.type       ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] All shards failed for phase: [query_fetch]
org.elasticsearch.action.NoShardAvailableActionException: [nagioslogserver][0] null
	at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:160)
	at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction.doExecute(TransportSearchQueryAndFetchAction.java:57)
	at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction.doExecute(TransportSearchQueryAndFetchAction.java:47)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
	at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:104)
	at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
	at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
	at org.elasticsearch.client.FilterClient.execute(FilterClient.java:66)
	at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.execute(BaseRestHandler.java:92)
	at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:338)
	at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:84)
	at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:53)
	at org.elasticsearch.rest.RestController.executeHandler(RestController.java:225)
	at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:170)
	at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
	at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
	at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:327)
	at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.messageReceived(HttpPipeliningHandler.java:60)
	at org.elasticsearch.common.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.common.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
	at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
	at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
	at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
[2017-09-18 08:58:01,710][DEBUG][action.search.type       ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] All shards failed for phase: [query_fetch]
org.elasticsearch.action.NoShardAvailableActionException: [nagioslogserver][0] null
	[... stack trace identical to the one above ...]
[2017-09-18 08:58:01,829][DEBUG][action.index             ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
And from the Logstash log:

Code:

{:timestamp=>"2017-09-18T08:55:48.150000-0700", :message=>"Got error to send bulk of actions: None of the configured nodes are available: []", :level=>:error}
{:timestamp=>"2017-09-18T08:55:48.150000-0700", :message=>"Failed to flush outgoing items", :outgoing_count=>330, :exception=>org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: [], :backtrace=>["org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(org/elasticsearch/client/transport/TransportClientNodesService.java:279)", "org.elasticsearch.client.transport.TransportClientNodesService.execute(org/elasticsearch/client/transport/TransportClientNodesService.java:198)", "org.elasticsearch.client.transport.support.InternalTransportClient.execute(org/elasticsearch/client/transport/support/InternalTransportClient.java:106)", "org.elasticsearch.client.support.AbstractClient.bulk(org/elasticsearch/client/support/AbstractClient.java:163)", "org.elasticsearch.client.transport.TransportClient.bulk(org/elasticsearch/client/transport/TransportClient.java:356)", "org.elasticsearch.action.bulk.BulkRequestBuilder.doExecute(org/elasticsearch/action/bulk/BulkRequestBuilder.java:164)", "org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:91)", "org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:65)", "java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)", "LogStash::Outputs::Elasticsearch::Protocols::NodeClient.bulk(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch/protocol.rb:224)", "LogStash::Outputs::Elasticsearch::Protocols::NodeClient.bulk(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch/protocol.rb:224)", "LogStash::Outputs::ElasticSearch.submit(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:466)", 
"LogStash::Outputs::ElasticSearch.submit(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:466)", "LogStash::Outputs::ElasticSearch.submit(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:465)", "LogStash::Outputs::ElasticSearch.submit(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:465)", "LogStash::Outputs::ElasticSearch.flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:490)", "LogStash::Outputs::ElasticSearch.flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:490)", "LogStash::Outputs::ElasticSearch.flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:489)", "LogStash::Outputs::ElasticSearch.flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:489)", "Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:219)", "Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:219)", "org.jruby.RubyHash.each(org/jruby/RubyHash.java:1341)", "Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:216)", "Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:216)", 
"Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:193)", "Stud::Buffer.buffer_flush(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:193)", "RUBY.buffer_initialize(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:112)", "org.jruby.RubyKernel.loop(org/jruby/RubyKernel.java:1511)", "RUBY.buffer_initialize(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:110)", "java.lang.Thread.run(java/lang/Thread.java:745)"], :level=>:warn}

Re: Elastic cluster going into a bad state

Post by cdienger »

Elasticsearch is limited to half of a system's total memory, so with 8 GB per node it can only use 4 GB, which is likely too little. If you look a little further back in the Elasticsearch logs, before the "All shards failed for phase" message, do you see any memory errors? That would be the smoking gun.
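Heap pressure is usually visible from the cluster itself. A quick way to check (ports, API paths, and log locations assume a stock Nagios Log Server / Elasticsearch 1.x install; adjust for your setup):

```shell
# Per-node heap usage; heap.percent pinned in the 90s indicates memory pressure
curl -s 'localhost:9200/_cat/nodes?v&h=host,heap.percent,ram.percent,load'

# Overall cluster state (red/yellow while shards are unassigned or recovering)
curl -s 'localhost:9200/_cluster/health?pretty'

# Look for outright OOMs or long old-gen GC pauses in the Elasticsearch logs
# (log path is an assumption; check your elasticsearch.yml / logging config)
grep -iE 'OutOfMemoryError|\[gc\]\[old\]' /var/log/elasticsearch/*.log | tail -20
```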

Re: Elastic cluster going into a bad state

Post by Jklre »

Looks like a lot of GC operations.

Code:

[2017-09-18 08:41:02,405][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288149][33485] duration [10.5s], collections [1]/[10.8s], total [10.5s]/[16.5h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [159.7mb]->[167.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:41:10,791][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288150][33486] duration [8s], collections [1]/[8.3s], total [8s]/[16.5h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [167.3mb]->[61.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:41:22,560][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [routing-table-updater] took 1m above the warn threshold of 30s
[2017-09-18 08:41:22,560][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288152][33487] duration [10.7s], collections [1]/[10.7s], total [10.7s]/[16.5h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[69.1mb]/[266.2mb]}{[survivor] [31.9mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:41:31,815][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288153][33488] duration [8.2s], collections [1]/[9.2s], total [8.2s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [69.1mb]->[83.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:41:44,352][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288155][33489] duration [11.4s], collections [1]/[11.5s], total [11.4s]/[16.5h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[84.9mb]/[266.2mb]}{[survivor] [28.8mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:41:56,681][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 9969 numDocs: 9969 vs. true
[2017-09-18 08:41:56,683][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 5690 numDocs: 5690 vs. true
[2017-09-18 08:41:56,685][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288156][33490] duration [11.7s], collections [1]/[12.3s], total [11.7s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [84.9mb]->[76.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:42:05,442][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288157][33491] duration [8.2s], collections [1]/[8.7s], total [8.2s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [76.1mb]->[88.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:42:17,672][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 78285 numDocs: 78285 vs. true
[2017-09-18 08:42:17,673][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 33602 numDocs: 33602 vs. true
[2017-09-18 08:42:17,687][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288158][33492] duration [11.4s], collections [1]/[12.2s], total [11.4s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [88.7mb]->[96mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:42:28,584][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288159][33493] duration [10.4s], collections [1]/[10.8s], total [10.4s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [96mb]->[90.6mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:42:37,426][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288160][33494] duration [8.1s], collections [1]/[8.8s], total [8.1s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [90.6mb]->[88.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:42:49,094][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288161][33495] duration [10.8s], collections [1]/[11.6s], total [10.8s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [88.3mb]->[110.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:01,934][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288162][33496] duration [12s], collections [1]/[12.8s], total [12s]/[16.5h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [110.8mb]->[97.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:09,538][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288163][33497] duration [6.8s], collections [1]/[7.6s], total [6.8s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [97.1mb]->[92.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:20,623][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 8295 numDocs: 8295 vs. true
[2017-09-18 08:43:20,639][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 3450 numDocs: 3450 vs. true
[2017-09-18 08:43:20,639][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288164][33498] duration [10.3s], collections [1]/[11.1s], total [10.3s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [92.2mb]->[97.6mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:20,664][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2017.07.15][1], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 30.9s above the warn threshold of 30s
[2017-09-18 08:43:33,259][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288165][33499] duration [12.1s], collections [1]/[12.6s], total [12.1s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [97.6mb]->[115.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:41,886][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288166][33500] duration [8.3s], collections [1]/[8.6s], total [8.3s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [115.2mb]->[190.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:54,423][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288167][33501] duration [12.1s], collections [1]/[12.5s], total [12.1s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [190.8mb]->[96.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:43:54,513][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [async_shard_fetch] took 33.8s above the warn threshold of 30s
[2017-09-18 08:44:03,267][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288168][33502] duration [8.3s], collections [1]/[8.8s], total [8.3s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [96.8mb]->[107.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:44:14,787][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288169][33503] duration [10.9s], collections [1]/[11.5s], total [10.9s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [107.2mb]->[112.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:44:22,470][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 15060 numDocs: 15060 vs. true
[2017-09-18 08:44:22,573][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288170][33504] duration [7.3s], collections [1]/[7.7s], total [7.3s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [112.4mb]->[106.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:44:33,852][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288171][33505] duration [10.6s], collections [1]/[11.2s], total [10.6s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [106.3mb]->[112.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:44:41,804][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288172][33506] duration [7.1s], collections [1]/[7.9s], total [7.1s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [112.4mb]->[105mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:44:41,808][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 6079 numDocs: 6079 vs. true
[2017-09-18 08:44:55,814][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288173][33507] duration [13.1s], collections [1]/[13.9s], total [13.1s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [105mb]->[121.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:03,499][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288174][33508] duration [7.1s], collections [1]/[7.7s], total [7.1s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [121.4mb]->[120.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:15,507][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288175][33509] duration [11.1s], collections [1]/[12s], total [11.1s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [120.8mb]->[115.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:15,746][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [async_shard_fetch] took 33.1s above the warn threshold of 30s
[2017-09-18 08:45:23,666][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288176][33510] duration [7.2s], collections [1]/[8.1s], total [7.2s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [115.9mb]->[114.5mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:36,478][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288177][33511] duration [11.8s], collections [1]/[12.8s], total [11.8s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [114.5mb]->[134.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:43,969][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 82813 numDocs: 82813 vs. true
[2017-09-18 08:45:43,970][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288178][33512] duration [6.8s], collections [1]/[7.4s], total [6.8s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [134.4mb]->[105.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:45:43,976][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 16629 numDocs: 16629 vs. true
[2017-09-18 08:45:57,074][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288179][33513] duration [12.4s], collections [1]/[13.1s], total [12.4s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [105.2mb]->[122.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:46:09,113][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288180][33514] duration [11.3s], collections [1]/[12s], total [11.3s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [122.4mb]->[110.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:46:16,852][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288181][33515] duration [6.9s], collections [1]/[7.7s], total [6.9s]/[16.6h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [110.9mb]->[109.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:46:29,872][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288182][33516] duration [12s], collections [1]/[13s], total [12s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [109.7mb]->[142mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:46:30,547][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 13229 numDocs: 13229 vs. true
[2017-09-18 08:46:30,615][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 84952 numDocs: 84952 vs. true
[2017-09-18 08:46:37,706][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288183][33517] duration [7s], collections [1]/[7.8s], total [7s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [142mb]->[115.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:46:50,876][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288184][33518] duration [12.3s], collections [1]/[13.1s], total [12.3s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [115.7mb]->[122.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:02,633][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288185][33519] duration [11.1s], collections [1]/[11.7s], total [11.1s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [122.2mb]->[128.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:10,512][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288186][33520] duration [7.4s], collections [1]/[7.8s], total [7.4s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [128.7mb]->[124.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:21,692][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288187][33521] duration [10.6s], collections [1]/[11.1s], total [10.6s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [124.3mb]->[130.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:21,718][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 67795 numDocs: 67795 vs. true
[2017-09-18 08:47:21,732][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 79060 numDocs: 79060 vs. true
[2017-09-18 08:47:21,743][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2016.01.07][1], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 30.5s above the warn threshold of 30s
[2017-09-18 08:47:29,096][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288188][33522] duration [6.8s], collections [1]/[7.4s], total [6.8s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [130.1mb]->[120.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:42,079][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288190][33523] duration [11.7s], collections [1]/[11.9s], total [11.7s]/[16.6h], memory [3.8gb]->[3.7gb]/[3.8gb], all_pools {[young] [266.2mb]->[128.4mb]/[266.2mb]}{[survivor] [17.9mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:50,104][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288191][33524] duration [7.2s], collections [1]/[8s], total [7.2s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [128.4mb]->[123.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:47:51,125][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 5561 numDocs: 5561 vs. true
[2017-09-18 08:48:02,327][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2017.06.22][4], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 32.4s above the warn threshold of 30s
[2017-09-18 08:48:02,355][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288193][33525] duration [11.1s], collections [1]/[11.2s], total [11.1s]/[16.6h], memory [3.8gb]->[3.7gb]/[3.8gb], all_pools {[young] [266.2mb]->[147.2mb]/[266.2mb]}{[survivor] [26.4mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:11,481][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288194][33526] duration [8.5s], collections [1]/[9.1s], total [8.5s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [147.2mb]->[117.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:23,064][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288195][33527] duration [10.8s], collections [1]/[11.5s], total [10.8s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [117.7mb]->[125.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:31,034][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288196][33528] duration [7.3s], collections [1]/[7.9s], total [7.3s]/[16.6h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [125.3mb]->[117mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:31,037][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 285 numDocs: 285 vs. true
[2017-09-18 08:48:45,587][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288197][33529] duration [13.6s], collections [1]/[14.5s], total [13.6s]/[16.6h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [117mb]->[123.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:59,153][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288198][33530] duration [12.7s], collections [1]/[13.5s], total [12.7s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [123.3mb]->[129.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:48:59,972][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 9677 numDocs: 9677 vs. true
[2017-09-18 08:49:08,688][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288199][33531] duration [8.6s], collections [1]/[9.5s], total [8.6s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [129.4mb]->[125.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:49:21,744][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288200][33532] duration [12s], collections [1]/[13s], total [12s]/[16.6h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [125.1mb]->[125.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:49:22,671][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 10485 numDocs: 10485 vs. true
[2017-09-18 08:49:34,710][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288201][33533] duration [12s], collections [1]/[12.9s], total [12s]/[16.7h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [125.1mb]->[145.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:49:42,846][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288202][33534] duration [7.8s], collections [1]/[8.1s], total [7.8s]/[16.7h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [145.7mb]->[107.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:49:43,023][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2015.01.11][2], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 42.9s above the warn threshold of 30s
[2017-09-18 08:49:56,239][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288203][33535] duration [12.9s], collections [1]/[13.3s], total [12.9s]/[16.7h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [107.7mb]->[163.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:03,868][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288204][33536] duration [7.2s], collections [1]/[7.6s], total [7.2s]/[16.7h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [163.9mb]->[132.1mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:15,678][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288205][33537] duration [11s], collections [1]/[11.8s], total [11s]/[16.7h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [132.1mb]->[119.6mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:23,070][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288206][33538] duration [6.9s], collections [1]/[7.3s], total [6.9s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [119.6mb]->[113mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:23,229][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 2884 numDocs: 2884 vs. true
[2017-09-18 08:50:23,269][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2016.01.04][4], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 40.2s above the warn threshold of 30s
[2017-09-18 08:50:34,656][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288207][33539] duration [10.9s], collections [1]/[11.5s], total [10.9s]/[16.7h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [113mb]->[124.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:35,276][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 5978 numDocs: 5978 vs. true
[2017-09-18 08:50:48,025][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288209][33540] duration [12.2s], collections [1]/[12.3s], total [12.2s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[104.2mb]/[266.2mb]}{[survivor] [13.2mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:50:48,729][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 5408 numDocs: 5408 vs. true
[2017-09-18 08:50:57,533][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288210][33541] duration [8.7s], collections [1]/[9.5s], total [8.7s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [104.2mb]->[121mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:09,469][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288212][33542] duration [10.7s], collections [1]/[10.9s], total [10.7s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[107.5mb]/[266.2mb]}{[survivor] [11mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:09,480][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 11495 numDocs: 11495 vs. true
[2017-09-18 08:51:22,633][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288213][33543] duration [12.6s], collections [1]/[13.1s], total [12.6s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [107.5mb]->[111.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:30,040][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288214][33544] duration [7s], collections [1]/[7.4s], total [7s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [111.8mb]->[113.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:30,054][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 296 numDocs: 296 vs. true
[2017-09-18 08:51:43,369][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288215][33545] duration [12.7s], collections [1]/[13.3s], total [12.7s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [113.4mb]->[117.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:52,531][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288216][33546] duration [8.6s], collections [1]/[9.1s], total [8.6s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [117.4mb]->[111.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:51:52,540][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 10652 numDocs: 10652 vs. true
[2017-09-18 08:52:05,811][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288217][33547] duration [12.4s], collections [1]/[13.2s], total [12.4s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [111.8mb]->[105.5mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:52:06,001][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 84283 numDocs: 84283 vs. true
[2017-09-18 08:52:19,316][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288219][33548] duration [12.4s], collections [1]/[12.5s], total [12.4s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[102.9mb]/[266.2mb]}{[survivor] [26.5mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:52:19,323][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 85077 numDocs: 85077 vs. true
[2017-09-18 08:52:21,129][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 7416 numDocs: 7416 vs. true
[2017-09-18 08:52:34,108][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288221][33549] duration [12.9s], collections [1]/[13.7s], total [12.9s]/[16.7h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [205mb]->[114.3mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:52:47,819][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288223][33550] duration [12.6s], collections [1]/[12.7s], total [12.6s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[117.8mb]/[266.2mb]}{[survivor] [33.2mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:52:48,594][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 82103 numDocs: 82103 vs. true
[2017-09-18 08:52:57,136][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288224][33551] duration [8.3s], collections [1]/[9.3s], total [8.3s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [117.8mb]->[113.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:52:58,328][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 9412 numDocs: 9412 vs. true
[2017-09-18 08:52:58,660][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 67268 numDocs: 67268 vs. true
[2017-09-18 08:53:10,592][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 5262 numDocs: 5262 vs. true
[2017-09-18 08:53:10,596][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288227][33552] duration [10.7s], collections [1]/[11.4s], total [10.7s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [266.2mb]->[101.4mb]/[266.2mb]}{[survivor] [9.3mb]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:53:22,296][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288228][33553] duration [11.1s], collections [1]/[11.7s], total [11.1s]/[16.7h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [101.4mb]->[159.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:53:29,223][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 11500 numDocs: 11500 vs. true
[2017-09-18 08:53:29,251][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288229][33554] duration [6.7s], collections [1]/[6.9s], total [6.7s]/[16.7h], memory [3.7gb]->[3.8gb]/[3.8gb], all_pools {[young] [159.8mb]->[235.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:53:40,747][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288230][33555] duration [11.2s], collections [1]/[11.4s], total [11.2s]/[16.7h], memory [3.8gb]->[3.6gb]/[3.8gb], all_pools {[young] [235.9mb]->[102.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:53:40,823][WARN ][cluster.service          ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] cluster state update task [shard-started ([logstash-2017.02.27][4], node[4KlaIgx3Sdu8dSa9NVCUQQ], [R], s[INITIALIZING]), reason [after recovery (replica) from node [[41a07432-8d31-4259-a3d5-9ba9c0379bad][L9BZoCYaT2i93EPP8ph0_g][pnls01lxv.mitchell.com][inet[/172.24.25.135:9300]]{max_local_storage_nodes=1}]]] took 41.1s above the warn threshold of 30s
[2017-09-18 08:53:49,807][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288231][33556] duration [8.5s], collections [1]/[9s], total [8.5s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [102.7mb]->[103.9mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:02,591][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288232][33557] duration [12.1s], collections [1]/[12.7s], total [12.1s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [103.9mb]->[108.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:02,605][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 6925 numDocs: 6925 vs. true
[2017-09-18 08:54:02,607][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 9830 numDocs: 9830 vs. true
[2017-09-18 08:54:12,119][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288233][33558] duration [8.8s], collections [1]/[9.5s], total [8.8s]/[16.7h], memory [3.6gb]->[3.6gb]/[3.8gb], all_pools {[young] [108.2mb]->[113.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:25,620][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 11511 numDocs: 11511 vs. true
[2017-09-18 08:54:25,656][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288234][33559] duration [12.5s], collections [1]/[13.5s], total [12.5s]/[16.7h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [113.7mb]->[127.4mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:35,196][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288235][33560] duration [8.6s], collections [1]/[9.5s], total [8.6s]/[16.7h], memory [3.7gb]->[3.6gb]/[3.8gb], all_pools {[young] [127.4mb]->[118.2mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:46,979][WARN ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288236][33561] duration [11.1s], collections [1]/[11.7s], total [11.1s]/[16.7h], memory [3.6gb]->[3.7gb]/[3.8gb], all_pools {[young] [118.2mb]->[133.7mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:55,847][INFO ][indices.recovery         ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] Recovery with sync ID 11950 numDocs: 11950 vs. true
[2017-09-18 08:54:55,998][INFO ][node                     ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] stopping ...
[2017-09-18 08:54:56,007][INFO ][monitor.jvm              ] [41a07432-8d31-4259-a3d5-9ba9c0379bad] [gc][old][288237][33562] duration [8.5s], collections [1]/[9s], total [8.5s]/[16.7h], memory [3.7gb]->[3.7gb]/[3.8gb], all_pools {[young] [133.7mb]->[128.8mb]/[266.2mb]}{[survivor] [0b]->[0b]/[33.2mb]}{[old] [3.5gb]->[3.5gb]/[3.5gb]}
[2017-09-18 08:54:56,158][WARN ][netty.channel.DefaultChannelPipeline] An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Elastic cluster going into a bad state

Post by cdienger »

We may need to go back to around the time those Recovery with sync ID messages started. Can you PM me the /var/log/elasticsearch/* directory?
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Elastic cluster going into a bad state

Post by Jklre »

PM sent. I went ahead and added 2 GB of memory to both of these nodes.

Thank you.
tacolover101
Posts: 432
Joined: Mon Apr 10, 2017 11:55 am

Re: Elastic cluster going into a bad state

Post by tacolover101 »

How much data do you currently have in open indices?

Depending on where this number is, you may need to increase the RAM further for better performance.
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Elastic cluster going into a bad state

Post by Jklre »

Cluster Statistics

Documents: 322,553,795
Primary Size: 131.8GB
Total Size: 197.6GB
Data Instances: 2
Total Shards: 10242
Indices: 1025
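For context, here is some quick back-of-the-envelope arithmetic on those figures (a hedged sketch; the "half of the shards are primaries" split is an assumption based on Log Server's usual default of 5 shards plus 1 replica per index):

```shell
# Rough arithmetic on the cluster statistics above.
shards=10242    # total shards reported
nodes=2         # data instances
indices=1025

echo "shards per node:   $(( shards / nodes ))"
# Assuming one replica per primary (a common Log Server default),
# roughly half of the total shards are primaries:
echo "approx primaries:  $(( shards / 2 ))"
echo "shards per index:  $(( shards / indices ))"
```

That works out to over 5,000 shards per node on a roughly 4 GB heap; per-shard overhead alone can account for much of the constant old-generation GC pressure shown in the log excerpt.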
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Elastic cluster going into a bad state

Post by tmcdonald »

It's usually memory with Log Server, historically speaking:

https://support.nagios.com/forum/viewto ... 37&t=33519

I would specifically draw your attention to this section:
DC6171 wrote:

Code: Select all

[root@logserver01 elasticsearch]# cat /etc/sysconfig/elasticsearch
# Directory where the Elasticsearch binary distribution resides
APP_DIR="/usr/local/nagioslogserver"
ES_HOME="$APP_DIR/elasticsearch"

# Heap Size (defaults to 256m min, 1g max)
# Nagios Log Server Default to 0.5 physical Memory
ES_HEAP_SIZE=$(expr $(free -m|awk '/^Mem:/{print $2}') / 2 )m
which means the Elasticsearch heap is set to half of the system memory, which on an 8 GB system is only 4 GB. Also, since you have only two nodes and they replicate data between themselves, each is essentially holding a full copy of all the logs. If you added a third (or more) instance, it would take on some of this burden. As you will see in the next link, you can scale up initially to a point before scaling out.
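As a minimal sketch of what that formula works out to (the 8192 MB figure is just an example matching the 8 GB nodes described earlier in the thread, not a value read from your system):

```shell
# Reproduce the ES_HEAP_SIZE calculation from /etc/sysconfig/elasticsearch
# for a node where "free -m" would report 8192 MB of physical memory:
total_mb=8192
heap_mb=$(( total_mb / 2 ))
echo "ES_HEAP_SIZE=${heap_mb}m"   # prints ES_HEAP_SIZE=4096m
```

With the 2 GB you added, the same formula would yield a 5 GB heap; note that the elasticsearch service has to be restarted for a new heap size to take effect.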

Starting on page 26 of this presentation, there are some recommendations given for performance tweaking:

https://www.slideshare.net/nagiosinc/da ... experience
Former Nagios employee
Locked