CircuitBreakingException with >30days indexes open

Posted: Thu May 24, 2018 12:53 pm
by vAJ
We tried keeping more than 30 days of indexes open on our 4-node cluster at the request of our engineering team, and searches now fail with the CircuitBreakingException below. Our previous 2-node cluster couldn't handle more than 14 days. I'm really hoping there is more performance tuning I can do in ES so that this 4-node cluster can crunch more data.

Code:

[2018-05-24 16:41:02,462][DEBUG][action.search.type       ] [41b94e87-cf97-48ac-a5c6-ed795f9e33f2] All shards failed for phase: [query]
org.elasticsearch.transport.RemoteTransportException: [a1d229b5-12fa-4c5e-8d5b-746ece0d27aa][inet[/10.50.30.107:9300]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.ElasticsearchException: org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [host.raw] would be larger than limit of [13450084352/12.5gb]
        at org.elasticsearch.index.fielddata.plain.AbstractIndexFieldData.load(AbstractIndexFieldData.java:80)
        at org.elasticsearch.search.aggregations.support.ValuesSource$MetaData.load(ValuesSource.java:88)
        at org.elasticsearch.search.aggregations.support.AggregationContext.bytesField(AggregationContext.java:180)
        at org.elasticsearch.search.aggregations.support.AggregationContext.valuesSource(AggregationContext.java:143)
        at org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.create(ValuesSourceAggregatorFactory.java:53)
        at org.elasticsearch.search.aggregations.AggregatorFactories.createAndRegisterContextAware(AggregatorFactories.java:53)
        at org.elasticsearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:157)
        at org.elasticsearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:79)
        at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:100)
        at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:301)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:312)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
        at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.common.util.concurrent.UncheckedExecutionException: org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [host.raw] would be larger than limit of [13450084352/12.5gb]
        at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
        at org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3937)
        at org.elasticsearch.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
        at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache.load(IndicesFieldDataCache.java:167)
        at org.elasticsearch.index.fielddata.plain.AbstractIndexFieldData.load(AbstractIndexFieldData.java:74)
        ... 17 more
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [host.raw] would be larger than limit of [13450084352/12.5gb]
        at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.circuitBreak(ChildMemoryCircuitBreaker.java:97)
        at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:148)
        at org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData$PagedBytesEstimator.beforeLoad(PagedBytesIndexFieldData.java:217)
        at org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData.loadDirect(PagedBytesIndexFieldData.java:89)
        at org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData.loadDirect(PagedBytesIndexFieldData.java:43)
        at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache$1.call(IndicesFieldDataCache.java:180)
        at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache$1.call(IndicesFieldDataCache.java:167)
        at org.elasticsearch.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
        at org.elasticsearch.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
        at org.elasticsearch.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
        at org.elasticsearch.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
        at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
        ... 21 more

Code:

Date        # Docs      Index Size (GB)
2018.04.24  80,123,435  23.80
2018.04.25  79,154,864  23.30
2018.04.26  75,573,485  21.90
2018.04.27  72,603,365  20.90
2018.04.28  13,752,469   3.90
2018.04.29  11,210,996   3.30
2018.04.30  34,017,591  12.00
2018.05.01  30,140,906  10.70
2018.05.02  27,746,199   9.80
2018.05.03  21,758,601   7.60
2018.05.04  31,733,987  12.10
2018.05.05   8,342,020   2.90
2018.05.06   8,754,301   3.10
2018.05.07  39,043,424  15.10
2018.05.08  37,447,506  14.40
2018.05.09  36,286,876  13.70
2018.05.10  39,211,018  14.60
2018.05.11  34,709,762  12.40
2018.05.12  11,404,047   3.50
2018.05.13  11,151,603   3.30
2018.05.14  44,699,775  16.70
2018.05.15  46,745,086  17.50
2018.05.16  33,705,539  12.10
2018.05.17  30,380,948  10.90
2018.05.18  27,627,349  10.10
2018.05.19   8,345,161   2.80
2018.05.20   7,643,582   2.60
2018.05.21  36,876,143  13.70
2018.05.22  36,335,297  13.20
2018.05.23  37,057,332  13.10
2018.05.24  17,505,744   6.60

All nodes have 8 CPUs and 64 GB of RAM.
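
For anyone triaging a similar log, the breaker line itself names the breaker, the field being loaded, and the limit in bytes. The following is a minimal Python sketch for pulling those values out of a log line. The regular expression is inferred only from the message format shown in the log above, and the function name `parse_breaker_message` is made up for illustration, so treat it as an assumption rather than a stable format.

```python
import re

# Pattern inferred from the message format in the log above (an assumption,
# not a documented or guaranteed format).
BREAKER_RE = re.compile(
    r"\[(?P<breaker>[A-Z_]+)\] Data too large, data for \[(?P<field>[^\]]+)\] "
    r"would be larger than limit of \[(?P<limit_bytes>\d+)/(?P<limit_human>[^\]]+)\]"
)


def parse_breaker_message(line):
    """Return breaker name, field, and limit from a log line, or None if no match."""
    match = BREAKER_RE.search(line)
    if match is None:
        return None
    limit_bytes = int(match.group("limit_bytes"))
    return {
        "breaker": match.group("breaker"),
        "field": match.group("field"),
        "limit_bytes": limit_bytes,
        # Convert bytes to GiB (2**30 bytes) for easier reading.
        "limit_gib": round(limit_bytes / 2**30, 2),
    }


if __name__ == "__main__":
    sample = (
        "CircuitBreakingException: [FIELDDATA] Data too large, data for "
        "[host.raw] would be larger than limit of [13450084352/12.5gb]"
    )
    print(parse_breaker_message(sample))
```

Here the field is `host.raw` and the breaker is `FIELDDATA`, which suggests the failure comes from loading fielddata for an aggregation on that field rather than from the raw index size on disk.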

Re: CircuitBreakingException with >30days indexes open

Posted: Thu May 24, 2018 8:38 pm
by scottwilkerson
You should be able to counter this by setting the following in
/usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml
on each server and then restarting Elasticsearch on each of them

Code:

indices.fielddata.cache.size: 20%

Reference: https://www.elastic.co/guide/en/elastic ... usage.html
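
As far as I understand the Elasticsearch 1.x behavior this stack trace appears to come from, the fielddata cache is unbounded unless `indices.fielddata.cache.size` is set, so old entries are never evicted and the fielddata breaker eventually trips. The setting only helps if the resulting cache size is below the breaker limit. The Python sketch below checks that arithmetic. The breaker limit is taken from the log earlier in this thread, while the 31 GiB heap is purely a hypothetical value, since the actual heap size is not stated here.

```python
GIB = 2**30

# Limit reported in the CircuitBreakingException earlier in the thread.
BREAKER_LIMIT_BYTES = 13450084352

# Hypothetical heap size for illustration only; the thread does not state it.
ASSUMED_HEAP_BYTES = 31 * GIB


def cache_limit_bytes(heap_bytes, cache_size_percent):
    """Bytes the fielddata cache may use for a percentage-of-heap setting."""
    return heap_bytes * cache_size_percent // 100


def evicts_before_breaker(heap_bytes, cache_size_percent, breaker_limit_bytes):
    """True if the cache cap is below the breaker limit, so eviction kicks in first."""
    return cache_limit_bytes(heap_bytes, cache_size_percent) < breaker_limit_bytes


if __name__ == "__main__":
    cap = cache_limit_bytes(ASSUMED_HEAP_BYTES, 20)
    print("cache cap: %.2f GiB" % (cap / GIB))
    print("below breaker limit:",
          evicts_before_breaker(ASSUMED_HEAP_BYTES, 20, BREAKER_LIMIT_BYTES))
```

Substitute the real heap size of your nodes before drawing any conclusion from the output.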

Re: CircuitBreakingException with >30days indexes open

Posted: Wed May 30, 2018 11:55 am
by vAJ
Thanks, Scott!

Checking my config, I realized that this setting was only present on one node (we recently rebuilt the cluster with 3 other new servers). Applying it on every node and restarting appears to allow searches over larger time ranges. Building the charts is still a little slow, but that's livable.

I'll keep an eye on it, but we can close this thread for now.

Eagerly awaiting your next release... ;)
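
Since the root cause here was a setting present on only one node, a quick consistency check across nodes can catch this after a rebuild. Below is a minimal Python sketch that scans the text of each node's elasticsearch.yml for the setting. The node names and the helper names `read_setting` and `nodes_missing_setting` are hypothetical, how the file contents are collected from each server is left out, and the check only handles the flat `key: value` form used in this thread, not nested YAML.

```python
import re

SETTING = "indices.fielddata.cache.size"


def read_setting(config_text, key=SETTING):
    """Return the value of a flat `key: value` line, or None if absent or commented out."""
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*:\s*(\S+)")
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def nodes_missing_setting(configs, key=SETTING):
    """Given {node_name: config_text}, return sorted names of nodes lacking the setting."""
    return sorted(name for name, text in configs.items()
                  if read_setting(text, key) is None)


if __name__ == "__main__":
    # Hypothetical node names and file contents, for illustration only.
    configs = {
        "node-1": "indices.fielddata.cache.size: 20%\n",
        "node-2": "# indices.fielddata.cache.size: 20%\n",
        "node-3": "cluster.name: example\n",
    }
    print(nodes_missing_setting(configs))
```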

Re: CircuitBreakingException with >30days indexes open

Posted: Wed May 30, 2018 12:10 pm
by scottwilkerson
Excellent! Glad to be of assistance.