
Elasticsearch "queue capacity 1000" errors

Posted: Thu May 28, 2015 10:22 am
by jvestrum
We are trying to ingest a fairly large volume of old logs (several GB) and I'm seeing errors in the elasticsearch log:


[2015-05-28 10:00:59,818][DEBUG][action.search.type       ] [a6a1ee31-789f-4927-8680-25814f651b54] [logstash-2013.05.27][1], node[ouCBVaMVQB2IA1_D54-7dA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@7c4b71b8] lastShard [true]
org.elasticsearch.transport.RemoteTransportException: [fd218450-44e4-4ed2-805a-74c1a72a2b63][inet[/169.10.69.98:9300]][search/phase/query]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler@31f23aab
        at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
        at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:219)
        at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:111)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Searches still appear to work correctly in the web interface, but the Administration link returns a completely blank page.

Because we are ingesting a lot of old logs that go back to 2011 and NLS creates an index for each day, we now have over 1400 indices, so perhaps that has something to do with it.
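The index count is plausible given one index per day; a quick back-of-the-envelope check (the start date below is an assumption, not taken from the logs, and GNU date is assumed):

```shell
# Days between roughly mid-2011 and late May 2015, i.e. roughly how many
# daily logstash-* indices that date range would produce.
echo $(( ( $(date -u -d '2015-05-28' +%s) - $(date -u -d '2011-06-01' +%s) ) / 86400 ))
# → 1457
```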

I have plenty of disk space, only using 44GB of 1TB.
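For reference, the EsRejectedExecutionException above means the search thread pool's bounded queue (capacity 1000) is full. Assuming Elasticsearch 1.x listening on localhost:9200 (the default bind), the queue depth and rejection counters can be inspected directly, which is a sketch rather than a definitive diagnostic:

```shell
# Per-node search queue depth and cumulative rejection count
# (assumes Elasticsearch 1.x on localhost:9200).
curl -s 'http://localhost:9200/_cat/thread_pool?v&h=host,search.queue,search.rejected'

# Rough count of how many indices the cluster is carrying.
curl -s 'http://localhost:9200/_cat/indices' | wc -l
```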

Re: Elasticsearch "queue capacity 1000" errors

Posted: Thu May 28, 2015 1:18 pm
by jvestrum
So, elasticsearch is chugging through snapshots of all those indices:


[2015-05-28 13:07:57,037][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.12] is done
[2015-05-28 13:08:29,914][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.13] is done
[2015-05-28 13:09:33,848][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.14] is done
[2015-05-28 13:09:56,155][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.15] is done
[2015-05-28 13:10:08,375][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.16] is done
[2015-05-28 13:10:18,581][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.17] is done
At this rate it will take a few hours to get through them all. I still can't get to the admin interface, but I'll let it run and see what happens once the snapshots are done.
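Snapshot progress can also be polled from the API rather than tailing the log; this is a sketch that assumes the repository name NagiosLogServerBackup seen in the log lines above, the default port, and a recent 1.x release:

```shell
# Every snapshot in the repository, with its state (SUCCESS, IN_PROGRESS, ...).
curl -s 'http://localhost:9200/_snapshot/NagiosLogServerBackup/_all?pretty'

# Only the snapshots currently running.
curl -s 'http://localhost:9200/_snapshot/_status?pretty'
```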

Re: Elasticsearch "queue capacity 1000" errors

Posted: Thu May 28, 2015 4:42 pm
by tgriep
Let us know how this works out for you.

Re: Elasticsearch "queue capacity 1000" errors

Posted: Mon Jun 01, 2015 10:55 am
by jvestrum
After the snapshots completed, the "queue capacity 1000" errors stopped. However, I still can't get into the Administration pages. I can log into the web interface and run searches, and everything else looks okay, but as soon as I click Administration I get a blank page. Apache is returning a 500 (Internal Server Error).
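For anyone debugging a similar 500 on the Administration page, the Apache error log usually names the underlying cause; the path below is the RHEL/CentOS default and is an assumption:

```shell
# Last few Apache errors; adjust the path for your distro.
tail -n 20 /var/log/httpd/error_log
```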

Re: Elasticsearch "queue capacity 1000" errors

Posted: Mon Jun 01, 2015 11:01 am
by tmcdonald
That is almost certainly PHP hitting its memory limit. Edit your /etc/php.ini like so:


memory_limit = 512M
I believe the default is 128M. Save the file, run "service httpd restart", and then visit the Administration page again.
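A minimal, self-contained sketch of that edit, using a stand-in file so nothing real is touched (on an actual system the file is /etc/php.ini and you would restart httpd afterwards):

```shell
# Stand-in for /etc/php.ini so this is safe to run anywhere.
printf 'memory_limit = 128M\n' > /tmp/php.ini

# The one-line change: raise memory_limit from the 128M default to 512M.
sed -i 's/^memory_limit *=.*/memory_limit = 512M/' /tmp/php.ini

grep '^memory_limit' /tmp/php.ini    # prints: memory_limit = 512M
```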

Re: Elasticsearch "queue capacity 1000" errors

Posted: Mon Jun 01, 2015 11:08 am
by jvestrum
Yes, that did the trick. Thanks!

Re: Elasticsearch "queue capacity 1000" errors

Posted: Mon Jun 01, 2015 1:01 pm
by jolson
Glad to hear it! Is this case good to close?

Re: Elasticsearch "queue capacity 1000" errors

Posted: Mon Jun 01, 2015 1:43 pm
by jvestrum
Yes, it's working fine now, it can be closed. Thanks.