Elasticsearch "queue capacity 1000" errors

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
jvestrum
Posts: 46
Joined: Tue Mar 03, 2015 10:45 am

Elasticsearch "queue capacity 1000" errors

Post by jvestrum »

We are trying to ingest a fairly large volume of old logs (several GB), and I'm seeing errors in the elasticsearch log:

Code: Select all

[2015-05-28 10:00:59,818][DEBUG][action.search.type       ] [a6a1ee31-789f-4927-8680-25814f651b54] [logstash-2013.05.27][1], node[ouCBVaMVQB2IA1_D54-7dA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@7c4b71b8] lastShard [true]
org.elasticsearch.transport.RemoteTransportException: [fd218450-44e4-4ed2-805a-74c1a72a2b63][inet[/169.10.69.98:9300]][search/phase/query]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler@31f23aab
        at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
        at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:219)
        at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:111)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Searches appear to still be working correctly in the web interface, but the Administration link returns a totally blank page.

Because we are ingesting a lot of old logs that go back to 2011, and NLS creates an index for each day, we now have over 1400 indices, so perhaps that has something to do with it.

I have plenty of disk space: only 44 GB of 1 TB is in use.
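For reference, the rejections above come from Elasticsearch's fixed-size search thread-pool queue. On the Elasticsearch 1.x versions that Nagios Log Server shipped with in this era, the queue can be enlarged in elasticsearch.yml (a hedged sketch — the setting name is the ES 1.x form, so verify it against your version before applying):

```yaml
# elasticsearch.yml — ES 1.x setting name (verify for your version).
# Raises the search queue from its default of 1000. A larger queue trades
# memory for fewer rejections; it does not make the searches themselves faster.
threadpool.search.queue_size: 2000
```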
jvestrum
Posts: 46
Joined: Tue Mar 03, 2015 10:45 am

Re: Elasticsearch "queue capacity 1000" errors

Post by jvestrum »

So, elasticsearch is chugging through snapshots of all those indices:

Code: Select all

[2015-05-28 13:07:57,037][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.12] is done
[2015-05-28 13:08:29,914][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.13] is done
[2015-05-28 13:09:33,848][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.14] is done
[2015-05-28 13:09:56,155][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.15] is done
[2015-05-28 13:10:08,375][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.16] is done
[2015-05-28 13:10:18,581][INFO ][snapshots                ] [a6a1ee31-789f-4927-8680-25814f651b54] snapshot [NagiosLogServerBackup:logstash-2013.03.17] is done
At this rate it will take a few hours to get through them all. I still can't get to the admin interface, but I'll let it run and see what happens once the snapshots are done.
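For a rough sense of "a few hours": the timestamps in the log excerpt above give an average per-snapshot time, which can be extrapolated over the ~1400 daily indices (an illustrative back-of-the-envelope sketch; the real rate varies with index size):

```python
from datetime import datetime

# Timestamps of the first and last snapshot log lines shown above.
first = datetime.strptime("2015-05-28 13:07:57", "%Y-%m-%d %H:%M:%S")
last = datetime.strptime("2015-05-28 13:10:18", "%Y-%m-%d %H:%M:%S")
snapshots = 6  # number of "snapshot ... is done" lines in the excerpt

# Average seconds per snapshot over the 5 intervals between 6 snapshots.
per_snapshot = (last - first).total_seconds() / (snapshots - 1)

# Rough total for ~1400 daily indices at this rate.
total_hours = 1400 * per_snapshot / 3600
print(round(per_snapshot, 1), round(total_hours, 1))  # → 28.2 11.0
```

So the excerpt's pace works out to roughly half a minute per index, on the order of half a day for the full backlog if nothing speeds up.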
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Elasticsearch "queue capacity 1000" errors

Post by tgriep »

Let us know how this works out for you.
Be sure to check out our Knowledgebase for helpful articles and solutions!
jvestrum
Posts: 46
Joined: Tue Mar 03, 2015 10:45 am

Re: Elasticsearch "queue capacity 1000" errors

Post by jvestrum »

After the snapshots completed, the "queue capacity 1000" errors stopped. However, I still can't get into the Administration pages. I can log into the web interface and do searches, and everything else looks okay, but as soon as I click on Administration I get a blank page. Apache is returning a 500 (Internal Server Error).
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Elasticsearch "queue capacity 1000" errors

Post by tmcdonald »

That's definitely a PHP memory limit. Edit your /etc/php.ini like so:

Code: Select all

memory_limit = 512M
I believe it is 128M by default. Save the file, run "service httpd restart", and then visit the Administration page again.
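If the page still errors after a change like this, it can help to confirm what memory_limit the file actually contains. A minimal sketch — the parser is illustrative and assumes simple `key = value` lines, and `php.ini.sample` is a stand-in for /etc/php.ini on the server:

```python
# Minimal sketch: read memory_limit from a php.ini-style file.
def read_memory_limit(path):
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("memory_limit"):
                return line.split("=", 1)[1].strip()
    return None

# Demo against a local copy; on the server the path would be /etc/php.ini.
with open("php.ini.sample", "w") as f:
    f.write("; demo file\nmemory_limit = 512M\n")
print(read_memory_limit("php.ini.sample"))  # → 512M
```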
Former Nagios employee
jvestrum
Posts: 46
Joined: Tue Mar 03, 2015 10:45 am

Re: Elasticsearch "queue capacity 1000" errors

Post by jvestrum »

Yes, that did the trick. Thanks!
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Elasticsearch "queue capacity 1000" errors

Post by jolson »

Glad to hear it! Is this case good to close?
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
jvestrum
Posts: 46
Joined: Tue Mar 03, 2015 10:45 am

Re: Elasticsearch "queue capacity 1000" errors

Post by jvestrum »

Yes, it's working fine now; it can be closed. Thanks.