logstash dying

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

logstash dying

Post by vmesquita »

The logstash process keeps dying on one of the nodes; I restart it, and after a while it dies again. This time I left the console open and captured these messages:

Code:

Oct 14, 2016 12:08:34 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Oct 14, 2016 12:08:37 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] loaded [], sites []
Exception in thread "Ruby-0-Thread-46: /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:406" Exception in thread "elasticsearch[6a7ce4ea-e1b9-47a1-af18-1c4d47243d20][generic][T#1]" Exception in thread "Ruby-0-Thread-39: /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.2.8-java/lib/logstash/outputs/elasticsearch.rb:406" Oct 14, 2016 12:30:54 PM org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler doSample
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to get node info for [#transport#-1][sa585][inet[localhost/127.0.0.1:9300]], disconnecting...
java.lang.OutOfMemoryError: GC overhead limit exceeded

java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jruby.runtime.ThreadContext.popRubyClass(ThreadContext.java:702)
        at org.jruby.runtime.ThreadContext.postYield(ThreadContext.java:1269)
        at org.jruby.runtime.ContextAwareBlockBody.post(ContextAwareBlockBody.java:29)
        at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:198)
        at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)
        at org.jruby.runtime.Block.call(Block.java:101)
        at org.jruby.RubyProc.call(RubyProc.java:290)
        at org.jruby.RubyProc.call(RubyProc.java:228)
        at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
        at java.lang.Thread.run(Thread.java:745)
java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jruby.runtime.ThreadContext.popRubyClass(ThreadContext.java:702)
        at org.jruby.runtime.ThreadContext.postYield(ThreadContext.java:1269)
        at org.jruby.runtime.ContextAwareBlockBody.post(ContextAwareBlockBody.java:29)
        at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:198)
        at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)
        at org.jruby.runtime.Block.call(Block.java:101)
        at org.jruby.RubyProc.call(RubyProc.java:290)
        at org.jruby.RubyProc.call(RubyProc.java:228)
        at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
        at java.lang.Thread.run(Thread.java:745)
java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "LogStash::Runner" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.jruby.RubyString.+(org/jruby/RubyString.java:1174)
        at Time.xmlschema(/usr/local/nagioslogserver/logstash/vendor/jruby/lib/ruby/1.9/time.rb:533)
        at Time.xmlschema(/usr/local/nagioslogserver/logstash/vendor/jruby/lib/ruby/1.9/time.rb:533)
        at LogStash::Timestamp.to_iso8601(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.1-java/lib/logstash/timestamp.rb:89)
        at LogStash::Timestamp.to_iso8601(/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.1-java/lib/logstash/timestamp.rb:89)
        at org.elasticsearch.common.xcontent.XContentBuilder.writeValue(org/elasticsearch/common/xcontent/XContentBuilder.java:1272)
        at org.elasticsearch.common.xcontent.XContentBuilder.writeMap(org/elasticsearch/common/xcontent/XContentBuilder.java:1163)
        at org.elasticsearch.common.xcontent.XContentBuilder.map(org/elasticsearch/common/xcontent/XContentBuilder.java:1072)
        at org.elasticsearch.action.index.IndexRequest.source(org/elasticsearch/action/index/IndexRequest.java:379)
        at org.elasticsearch.action.index.IndexRequest.source(org/elasticsearch/action/index/IndexRequest.java:368)
        at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
Oct 14, 2016 12:30:56 PM org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler doSample
INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to get node info for [#transport#-1][sa585][inet[localhost/127.0.0.1:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9300]][cluster:monitor/nodes/info] request_id [1057] timed out after [13544ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Oct 14, 2016 12:30:56 PM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] exception caught on transport layer [[id: 0x41d0c3b9, /127.0.0.1:35161 => localhost/127.0.0.1:9300]], closing connection
java.lang.OutOfMemoryError: GC overhead limit exceeded

Oct 14, 2016 12:30:56 PM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] exception caught on transport layer [[id: 0x41d0c3b9, /127.0.0.1:35161 :> localhost/127.0.0.1:9300]], closing connection
java.io.StreamCorruptedException: invalid internal transport message format, got (0,0,0,0)
        at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:482)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
        at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:58)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
        at org.elasticsearch.common.netty.channel.Channels.close(Channels.java:812)
        at org.elasticsearch.common.netty.channel.AbstractChannel.close(AbstractChannel.java:206)
        at org.elasticsearch.transport.netty.NettyTransport.exceptionCaught(NettyTransport.java:611)
        at org.elasticsearch.transport.netty.MessageChannelHandler.exceptionCaught(MessageChannelHandler.java:237)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
        at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:566)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Any ideas?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: logstash dying

Post by rkennedy »

The issue is due to running out of memory -

Code:

java.lang.OutOfMemoryError: GC overhead limit exceeded
You'll want to increase the amount of memory on the machine, or reduce the number of days that you keep indexes open on the Backup & Maintenance page. What are your current Backup & Maintenance settings?
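If it helps to confirm, a couple of quick checks on the affected node can show how much memory is installed and what heap the logstash JVM actually received. This is only a sketch; the process grep pattern assumes the stock install:

```shell
# Total and free memory on the node, in MB
free -m
# The -Xmx heap flag the running logstash JVM was started with, if any
ps aux | grep '[l]ogstash' | grep -o 'Xmx[0-9]*[mg]' || echo "logstash not running"
```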
Former Nagios Employee
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: logstash dying

Post by vmesquita »

Hello!

Sorry for the late reply. After your message, we realized that one of the nodes had 8 GB while the other had 16 GB, so we assumed that increasing the memory to 16 GB would probably fix the issue. However, now that we have finally increased the server memory, the issue keeps happening. Here's our Backup and Maintenance config:

Optimize Indexes older than 2 days
Close indexes older than 7 days
Delete indexes older than 0 days
Repository to store backups in
Delete backups older than 720 days
Enable Maintenance and Backups Yes

I am thinking of dropping the "Close index" parameter to 2 just to see if the crashes stop. Then we can increase it little by little. Does that make sense?
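For what it's worth, one way to watch whether lowering that setting actually reduces the number of open indices is to list them from the node. A sketch, assuming Elasticsearch answers on the default HTTP port locally (whether closed indices appear in the listing depends on the Elasticsearch version):

```shell
# List the cluster's indices with their health and on-disk sizes
curl -s 'http://localhost:9200/_cat/indices?v'
```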
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: logstash dying

Post by mcapra »

That makes sense. You might also try increasing the logstash heap size and open-files limit by modifying the following values in /etc/init.d/logstash:

Code:

LS_HEAP_SIZE="1000m"
LS_OPEN_FILES=65535
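After editing those values, the service needs a restart for the new heap size to take effect. A sketch, assuming the stock init script path; the 2000m value is just an example, not a recommendation:

```shell
# Raise the heap in the init script, then restart so the JVM picks it up
sudo sed -i 's/^LS_HEAP_SIZE=.*/LS_HEAP_SIZE="2000m"/' /etc/init.d/logstash
sudo service logstash restart
```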
Former Nagios employee
https://www.mcapra.com/
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: logstash dying

Post by vmesquita »

That did the trick, so far. Thanks.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: logstash dying

Post by mcapra »

Awesome! Did you have any additional questions about this issue, or is it OK if we close this thread and mark it as resolved?
Former Nagios employee
https://www.mcapra.com/
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: logstash dying

Post by vmesquita »

You can close the thread. :)
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: logstash dying

Post by rkennedy »

Will do! Feel free to make a new one if you have questions in the future.
Former Nagios Employee
Locked