logstash dying

Posted: Fri Oct 14, 2016 11:42 am
by vmesquita
The logstash process keeps dying on one of the nodes; I restart it, and after a while it dies again. This time I left the console open and captured these messages:

Code:

Oct 14, 2016 12:08:34 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Exception in thread "Ruby-0-Thread-46: /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:406" Exception in thread "elasticsearch[6a7ce4ea-e1b9-47a1-af18-1c4d47243d20][generic][T#1]" Exception in thread "Ruby-0-Thread-39: /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:406" Oct 14, 2016 12:30:54 PM org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler doSample
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to get node info for [#transport#-1][sa585][inet[localhost/127.0.0.1:9300]], disconnecting...
java.lang.OutOfMemoryError: GC overhead limit exceeded

java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jruby.runtime.ThreadContext.popRubyClass(ThreadContext.java:702)
        at org.jruby.runtime.ThreadContext.postYield(ThreadContext.java:1269)
        at org.jruby.runtime.ContextAwareBlockBody.post(ContextAwareBlockBody.java:29)
        at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:198)
        at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)
        at org.jruby.runtime.Block.call(Block.java:101)
        at org.jruby.RubyProc.call(RubyProc.java:290)
        at org.jruby.RubyProc.call(RubyProc.java:228)
        at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
        at java.lang.Thread.run(Thread.java:745)
java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jruby.runtime.ThreadContext.popRubyClass(ThreadContext.java:702)
        at org.jruby.runtime.ThreadContext.postYield(ThreadContext.java:1269)
        at org.jruby.runtime.ContextAwareBlockBody.post(ContextAwareBlockBody.java:29)
        at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:198)
        at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)
        at org.jruby.runtime.Block.call(Block.java:101)
        at org.jruby.RubyProc.call(RubyProc.java:290)
        at org.jruby.RubyProc.call(RubyProc.java:228)
        at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
        at java.lang.Thread.run(Thread.java:745)
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "LogStash::Runner" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.jruby.RubyString.+(org/jruby/RubyString.java:1174)
        at Time.xmlschema(/usr/local/nagioslogserver/logstash/vendor/jruby/lib/ruby/1.9/time.rb:533)
        at Time.xmlschema(/usr/local/nagioslogserver/logstash/vendor/jruby/lib/ruby/1.9/time.rb:533)
        at LogStash::Timestamp.to_iso8601(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.1-java/lib/logstash/timestamp.rb:89)
        at LogStash::Timestamp.to_iso8601(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.1-java/lib/logstash/timestamp.rb:89)
        at org.elasticsearch.common.xcontent.XContentBuilder.writeValue(org/elasticsearch/common/xcontent/XContentBuilder.java:1272)
        at org.elasticsearch.common.xcontent.XContentBuilder.writeMap(org/elasticsearch/common/xcontent/XContentBuilder.java:1163)
        at org.elasticsearch.common.xcontent.XContentBuilder.map(org/elasticsearch/common/xcontent/XContentBuilder.java:1072)
        at org.elasticsearch.action.index.IndexRequest.source(org/elasticsearch/action/index/IndexRequest.java:379)
        at org.elasticsearch.action.index.IndexRequest.source(org/elasticsearch/action/index/IndexRequest.java:368)
        at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
Oct 14, 2016 12:30:56 PM org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler doSample
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to get node info for [#transport#-1][sa585][inet[localhost/127.0.0.1:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9300]][cluster:monitor/nodes/info] request_id [1057] timed out after [13544ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Oct 14, 2016 12:30:56 PM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] exception caught on transport layer [[id: 0x41d0c3b9, /127.0.0.1:35161 => localhost/127.0.0.1:9300]], closing connection
java.lang.OutOfMemoryError: GC overhead limit exceeded

Oct 14, 2016 12:30:56 PM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] exception caught on transport layer [[id: 0x41d0c3b9, /127.0.0.1:35161 :> localhost/127.0.0.1:9300]], closing connection
java.io.StreamCorruptedException: invalid internal transport message format, got (0,0,0,0)
        at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:482)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
        at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:58)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
        at org.elasticsearch.common.netty.channel.Channels.close(Channels.java:812)
        at org.elasticsearch.common.netty.channel.AbstractChannel.close(AbstractChannel.java:206)
        at org.elasticsearch.transport.netty.NettyTransport.exceptionCaught(NettyTransport.java:611)
        at org.elasticsearch.transport.netty.MessageChannelHandler.exceptionCaught(MessageChannelHandler.java:237)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
        at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:566)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Any ideas?

Re: logstash dying

Posted: Fri Oct 14, 2016 11:54 am
by rkennedy
The issue is caused by the JVM running out of memory:

Code:

java.lang.OutOfMemoryError: GC overhead limit exceeded
You'll want to increase the amount of memory on the machine, or reduce the number of days that you're keeping indexes open on the Backup & Maintenance page. What are your current Backup & Maintenance settings?
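As a quick sanity check before changing settings, you can compare the node's total memory against what you expect it to have (a minimal sketch; the 16 GB threshold below is just an illustrative target, not a product requirement):

```shell
#!/bin/sh
# Read total physical memory from /proc/meminfo (value is in kB).
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
total_mb=$((total_kb / 1024))
echo "Total memory: ${total_mb} MB"

# Warn if this node has less memory than its peers
# (16 GB is an assumed target for illustration only).
if [ "$total_mb" -lt 16000 ]; then
    echo "WARNING: this node has less than ~16 GB of RAM"
fi
```

Running this on each node makes it easy to spot a node that was provisioned with less memory than the rest of the cluster.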

Re: logstash dying

Posted: Tue Nov 29, 2016 1:28 pm
by vmesquita
Hello!

Sorry for the late reply. After your message, we realized that one of the nodes had 8 GB of RAM while the other had 16 GB, so we assumed that increasing the memory to 16 GB would probably fix the issue. However, now that we have finally increased the server's memory, the issue keeps happening. Here's our Backup and Maintenance config:

Optimize Indexes older than 2 days
Close indexes older than 7 days
Delete indexes older than 0 days
Repository to store backups in
Delete backups older than 720 days
Enable Maintenance and Backups Yes

I am thinking of dropping the "Close indexes" parameter to 2 just to see if the crashes stop, and then increasing it little by little. Does that make sense?
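For reference, the daily indexes are typically named by date (e.g. logstash-YYYY.MM.DD), so a "close indexes older than 2 days" setting closes everything before a cutoff date. A quick way to see that cutoff (a sketch assuming GNU date and the default logstash index naming):

```shell
#!/bin/sh
# Compute the cutoff date for a hypothetical "close indexes older
# than 2 days" setting (GNU date relative-date syntax assumed).
cutoff=$(date -d "2 days ago" +%Y.%m.%d)
echo "Indexes older than logstash-${cutoff} would be closed"
```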

Re: logstash dying

Posted: Tue Nov 29, 2016 3:32 pm
by mcapra
That makes sense. You might also try increasing the logstash heap size and open files limit by modifying the following values in /etc/init.d/logstash:

Code:

LS_HEAP_SIZE="1000m"
LS_OPEN_FILES=65535
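If you want to script a change like this, something along these lines works (a sketch: it edits a scratch copy so it is safe to run anywhere, and the 2000m value is just an example; pick a size that fits your RAM. In practice you would edit /etc/init.d/logstash itself and then restart logstash, e.g. "service logstash restart", for the change to take effect):

```shell
#!/bin/sh
# Create a scratch copy standing in for /etc/init.d/logstash.
cat > /tmp/logstash.init.example <<'EOF'
LS_HEAP_SIZE="1000m"
LS_OPEN_FILES=65535
EOF

# Bump the heap size in place (GNU sed).
sed -i 's/^LS_HEAP_SIZE=.*/LS_HEAP_SIZE="2000m"/' /tmp/logstash.init.example

# Show the result.
grep '^LS_HEAP_SIZE=' /tmp/logstash.init.example
```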

Re: logstash dying

Posted: Wed Dec 28, 2016 1:36 pm
by vmesquita
That did the trick so far. Thanks!

Re: logstash dying

Posted: Wed Dec 28, 2016 1:45 pm
by mcapra
Awesome! Did you have additional questions regarding this issue, or is it ok if we close this thread and mark the issue as resolved?

Re: logstash dying

Posted: Thu Jan 05, 2017 12:50 pm
by vmesquita
You can close the thread. :)

Re: logstash dying

Posted: Thu Jan 05, 2017 1:27 pm
by rkennedy
Will do! Feel free to make a new one if you have questions in the future.