Logstash failing after 2.2 upgrade

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Shawn Parr
Posts: 10
Joined: Thu Mar 26, 2015 9:57 am

Logstash failing after 2.2 upgrade

Post by Shawn Parr »

I updated our 2.1 cluster to 2.2 yesterday, and all appeared to be going well at first. However, a couple of hours after the update the cluster still hadn't gone back to green; it was stuck at yellow. The only step I didn't get to was re-enabling shard allocation, since the docs say to wait until the cluster is green before doing that.

This morning I noticed that the cluster status was red, and both nodes showed red exclamation points for Logstash. I restarted Logstash on both nodes, but within a couple of minutes it had stopped running again. So we aren't collecting any logs now, and the dashboards don't show any events coming in. I could use some help on this; I don't really know where to start.
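As a possible starting point for this kind of situation, the cluster health endpoint reports the status colour together with node and shard counts. The following is a minimal sketch, assuming Elasticsearch answers on localhost:9200 (the address used in the upgrade guide's curl example); `fetch_health` and `describe_health` are hypothetical helper names, not part of Nagios Log Server.

```python
import json
import urllib.request

# Assumption: Elasticsearch listens on localhost:9200, as in the upgrade guide.
ES_URL = "http://localhost:9200"


def fetch_health(base_url=ES_URL):
    """GET /_cluster/health and return the parsed JSON document."""
    with urllib.request.urlopen(base_url + "/_cluster/health", timeout=10) as resp:
        return json.load(resp)


def describe_health(health):
    """Summarise a cluster health document in one line (hypothetical helper)."""
    status = health.get("status", "unknown")
    meaning = {
        "green": "all primary and replica shards are allocated",
        "yellow": "all primaries are allocated, but some replicas are not",
        "red": "at least one primary shard is not allocated",
    }.get(status, "status not recognised")
    return "{}: {} ({} nodes, {} unassigned shards)".format(
        status,
        meaning,
        health.get("number_of_nodes", "?"),
        health.get("unassigned_shards", "?"),
    )


# Usage on a node (performs a network call):
#   print(describe_health(fetch_health()))
```

A red status with a large number of unassigned shards, while all expected nodes are present, would point toward allocation rather than a missing node.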
Last edited by Shawn Parr on Tue Sep 15, 2015 7:49 am, edited 1 time in total.
Shawn Parr

Re: Logstash failing after 2.2 upgrade

Post by Shawn Parr »

When I tail the Elasticsearch log and load the main page on one of the nodes, this is what I get:

[2015-09-15 07:39:38,767][DEBUG][action.search.type ] [15e63c11-7888-4139-8d1d-66b8096a8869] All shards failed for phase: [query]
org.elasticsearch.action.NoShardAvailableActionException: [logstash-2015.09.15][4] null
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:160)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:55)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:45)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
at org.elasticsearch.client.FilterClient.execute(FilterClient.java:66)
at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.execute(BaseRestHandler.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:338)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:84)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:53)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:225)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:170)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:327)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.messageReceived(HttpPipeliningHandler.java:60)
at org.elasticsearch.common.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

In addition, my Logstash log file is pretty much just this, repeated over thousands of lines:

{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
{:timestamp=>"2015-09-15T07:57:03.946000-0500", :message=>"retrying failed action with response code: 503", :level=>:warn}
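HTTP 503 means "Service Unavailable", which is consistent with Elasticsearch having no shard available to accept the writes. When a log is thousands of near-identical lines, counting the distinct messages can confirm whether anything else is hiding in it. This is a rough sketch that only assumes the line format shown above; `count_messages` is a hypothetical helper, and the log path in the usage comment is an assumption that may differ on your install.

```python
import re
from collections import Counter

# Matches the :message=>"..." field of a Logstash log line as shown above.
MESSAGE_RE = re.compile(r':message=>"((?:[^"\\]|\\.)*)"')


def count_messages(lines):
    """Count how often each :message value occurs; lines without one are skipped."""
    counts = Counter()
    for line in lines:
        match = MESSAGE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Usage (the path is an assumption):
#   with open("/var/log/logstash/logstash.log") as fh:
#       for message, n in count_messages(fh).most_common(10):
#           print(n, message)
```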
Shawn Parr

Re: Logstash failing after 2.2 upgrade

Post by Shawn Parr »

In a moment of desperation I tried re-enabling shard allocation. That appears to have fixed it. I'm seeing today's index grow, and my charts are displaying some data again. I just hope I didn't lose about half a day's data.

I'm pretty unhappy about this, as the update documentation states the following:
Re-enable shard allocation
After all instances in the cluster have been upgraded and the cluster is in a healthy state (green), re-enable shard allocation – this only needs to be run on a single node:
curl -XPUT localhost:9200/_cluster/settings -d '
{
  "transient" : {
    "cluster.routing.allocation.enable" : "all"
  }
}'
I was waiting to see green status before re-enabling, per the provided documentation. However, the cluster apparently wouldn't reach a green state until that was done. There is another post here which indicated that enabling shard allocation could help (they were stuck in yellow, and enabling it helped them get to green). It would definitely be a positive if following the steps in the upgrade guide didn't cause a failure.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Logstash failing after 2.2 upgrade

Post by jolson »

Shawn Parr wrote:
"I was waiting to see green status before enabling per the provided documentation. However the cluster apparently wouldn't achieve green state until that was done. There is another post here which gave me the indication that enabling shard allocation could help (they were stuck in yellow, enabling helped them get to green). It would definitely be a positive if following the steps in the upgrade guide didn't cause failure."
I will edit the documentation now. I was the technician who added the shard allocation steps to that document in the first place, and my intention was to remove the part about the cluster needing to be in green health before re-enabling. I have replaced the 'green health' step with a step that ensures all of your nodes have joined the cluster appropriately.

Thank you for the feedback - you should see the update reflected soon.
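The revised documentation step is not quoted in this thread, so the following is only a sketch of what "check that all nodes have joined, then re-enable allocation" might look like. It assumes Elasticsearch on localhost:9200 and that you know how many nodes to expect; `all_nodes_joined` and `build_enable_allocation_request` are hypothetical helper names. The request body is the same one shown in the curl command from the upgrade guide.

```python
import json
import urllib.request

# Assumption: Elasticsearch listens on localhost:9200, as in the upgrade guide.
ES_URL = "http://localhost:9200"


def all_nodes_joined(health, expected_nodes):
    """True when /_cluster/health reports the expected number of nodes."""
    return health.get("number_of_nodes") == expected_nodes


def build_enable_allocation_request(base_url=ES_URL):
    """Build (but do not send) the PUT that re-enables shard allocation."""
    body = json.dumps(
        {"transient": {"cluster.routing.allocation.enable": "all"}}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Usage on a two-node cluster (performs network calls):
#   with urllib.request.urlopen(ES_URL + "/_cluster/health") as resp:
#       health = json.load(resp)
#   if all_nodes_joined(health, expected_nodes=2):
#       urllib.request.urlopen(build_enable_allocation_request())
```

Gating on the node count rather than on green status avoids the deadlock described earlier in the thread, where the cluster could not turn green while allocation was still disabled.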