Hi, my Logstash is frequently stopping abruptly with the error below, even though all nodes are up and running.
Code: Select all
{:timestamp=>"2017-08-02T09:21:57.769000-0700", :message=>"Got error to send bulk of actions: None of the configured nodes are available: []", :level=>:error}
{:timestamp=>"2017-08-02T09:21:57.769000-0700", :message=>"Failed to flush outgoing items", :outgoing_count=>337, :exception=>org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: [], :backtrace=>["org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(org/elasticsearch/client/transport/TransportClientNodesService.java:279)", "org.elasticsearch.client.transport.TransportClientNodesService.execute(org/elasticsearch/client/transport/TransportClientNodesService.java:198)",
Below is the output of top on bcn1:
Code: Select all
top - 10:07:32 up 142 days, 11 min, 2 users, load average: 1.12, 0.59, 0.45
Tasks: 121 total, 1 running, 120 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.0%us, 0.6%sy, 1.5%ni, 94.5%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8193024k total, 8067724k used, 125300k free, 18320k buffers
Swap: 262136k total, 9656k used, 252480k free, 2327556k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14362 nagios 39 19 2294m 704m 14m S 168.8 8.8 2:19.31 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -Xss2048k -Djffi.boot.library.path=/app/nagioslogserver/
22296 apache 20 0 335m 12m 3176 S 4.0 0.2 0:04.68 /usr/sbin/httpd
22321 apache 20 0 334m 11m 3120 S 4.0 0.1 0:04.56 /usr/sbin/httpd
21966 nagios 20 0 19.4g 5.0g 709m S 2.0 64.4 73:19.55 /usr/bin/java -Xms4000m -Xmx4000m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX
1 root 20 0 19232 1252 1056 S 0.0 0.0 0:00.88 /sbin/init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kthreadd]
Logstash stopping abruptly
Re: Logstash stopping abruptly
It would appear as though Logstash is unable to talk to Elasticsearch for some period of time.
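One quick way to verify that from the Logstash host is the cluster health API (a generic check, assuming Elasticsearch's default HTTP port of 9200):
Code: Select all
curl -XGET 'localhost:9200/_cluster/health?pretty'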
Can you share your Elasticsearch logs? They can typically be found here:
Code: Select all
/var/log/elasticsearch/*.log
Former Nagios employee
https://www.mcapra.com/
Re: Logstash stopping abruptly
Code: Select all
[2017-08-02 13:29:52,667][DEBUG][action.search.type ] [abd0aca5-8cbf-4f11-988e-be0d778f5f95] All shards failed for phase: [query]
org.elasticsearch.transport.RemoteTransportException: [a40fda6a-2269-44c8-9c95-77eaf5a865dd][inet[/136.133.238.46:9300]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.SearchParseException: [logstash-2017.08.01][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [
{"facets":{"0":{"date_histogram":{"field":"@timestamp","interval":"10m"},"global":true,
"facet_filter":{"fquery":{"query":{"filtered":{"query":{"query_string":{"query":"PartsView\/PartsView"}},
"filter":{"bool":{"must":[{"range":{"@timestamp":{"from":1501619394749,"to":1501705794749}}}],"must_not":[
{"terms":{"host.raw":["136.133.230.76"]}},{"terms":{"host.raw":["136.133.230.204"]}},{"terms":{"host.raw":["136.133.230.207"]}},{"terms":{"host.raw":["136.133.230.180"]}},{"terms":{"host.raw":["136.133.231.113"]}},
{"terms":{"host.raw":["136.133.230.205"]}},{"terms":{"host.raw":["136.133.230.182"]}},{"terms":{"host.raw":["136.133.231.111"]}},{"terms":{"host.raw":["0:0:0:0:0:0:0:1"]}},{"terms":{"host.raw":["136.133.230.134"]}},
{"terms":{"host.raw":["136.133.230.200"]}},{"terms":{"host.raw":["136.133.236.147"]}},{"terms":{"host.raw":["136.133.236.147"]}},{"terms":{"host.raw":["136.133.236.147"]}},{"terms":{"host.raw":["136.133.230.221"]}},
{"terms":{"host.raw":["136.133.231.211"]}},{"terms":{"host.raw":["136.133.231.213"]}},{"terms":{"host.raw":["136.133.230.192"]}},{"terms":{"host.raw":["136.133.230.235"]}},{"terms":{"host.raw":["136.133.230.238"]}},
{"terms":{"host.raw":["136.133.236.203"]}},{"terms":{"host.raw":["136.133.230.239"]}},{"terms":{"host.raw":["136.133.230.246"]}},{"terms":{"host":["136.133.230.139"]}},{"terms":{"host":["136.133.230.139"]}},
{"terms":{"host.raw":["136.133.230.143"]}},{"terms":{"host.raw":["136.133.230.143"]}},{"terms":{"host.raw":["136.133.230.223"]}},{"terms":{"host.raw":["136.133.230.159"]}},{"terms":{"host.raw":["136.133.230.159"]}},
{"terms":{"host.raw":["136.133.230.154"]}},{"terms":{"host.raw":["136.133.230.191"]}},{"terms":{"host.raw":["136.133.231.59"]}},{"terms":{"host.raw":["136.133.231.93"]}},{"terms":{"host.raw":["136.133.236.166"]}},
{"terms":{"host.raw":["136.133.236.62"]}},{"terms":{"host.raw":["136.133.236.204"]}},{"terms":{"host.raw":["136.133.236.149"]}},{"terms":{"host.raw":["136.133.231.239"]}},{"terms":{"host.raw":["136.133.230.194"]}},
{"terms":{"host.raw":["136.133.231.5"]}},{"terms":{"host.raw":["136.133.230.4"]}},{"terms":{"host.raw":["136.133.230.6"]}},{"terms":{"host.raw":["136.133.230.6"]}},{"terms":{"host.raw":["136.133.230.5"]}},
{"terms":{"host.raw":["136.133.230.5"]}},{"terms":{"host.raw":["136.133.175.247"]}},{"terms":{"host.raw":["136.133.175.248"]}},{"terms":{"host.raw":["136.133.175.248"]}},{"terms":{"host.raw":["136.133.131.249"]}},
{"terms":{"host.raw":["136.133.131.249"]}},{"terms":{"host.raw":["136.133.131.249"]}},{"terms":{"host.raw":["136.133.24.249"]}},{"terms":{"host.raw":["136.133.24.249"]}},{"terms":{"host.raw":["136.133.171.8"]}},
{"terms":{"host.raw":["136.133.171.8"]}},{"terms":{"host.raw":["136.133.160.249"]}},{"terms":{"host.raw":["136.133.151.249"]}},{"terms":{"host.raw":["136.133.133.31"]}},{"terms":{"host.raw":["136.133.117.249"]}},
{"terms":{"host.raw":["136.133.141.249"]}}
]}}}}}}}},"size":0}]]
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:735)
at org.elasticsearch.search.SearchService.createContext(SearchService.java:560)
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:532)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:294)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.query.QueryParsingException: [logstash-2017.08.01] Failed to parse query [PartsView/PartsView]
at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:250)
at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:302)
at org.elasticsearch.index.query.FilteredQueryParser.parse(FilteredQueryParser.java:71)
at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:302)
at org.elasticsearch.index.query.FQueryFilterParser.parse(FQueryFilterParser.java:66)
at org.elasticsearch.index.query.QueryParseContext.executeFilterParser(QueryParseContext.java:368)
at org.elasticsearch.index.query.QueryParseContext.parseInnerFilter(QueryParseContext.java:349)
at org.elasticsearch.index.query.IndexQueryParserService.parseInnerFilter(IndexQueryParserService.java:295)
at org.elasticsearch.search.facet.FacetParseElement.parse(FacetParseElement.java:86)
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:719)
... 10 more
Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'PartsView/PartsView': Lexical error at line 1, column 20. Encountered: <EOF> after : "/PartsView"
at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:137)
at org.apache.lucene.queryparser.classic.MapperQueryParser.parse(MapperQueryParser.java:891)
at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:233)
... 19 more
Caused by: org.apache.lucene.queryparser.classic.TokenMgrError: Lexical error at line 1, column 20. Encountered: <EOF> after : "/PartsView"
at org.apache.lucene.queryparser.classic.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1133)
at org.apache.lucene.queryparser.classic.QueryParser.jj_scan_token(QueryParser.java:601)
at org.apache.lucene.queryparser.classic.QueryParser.jj_3R_2(QueryParser.java:484)
at org.apache.lucene.queryparser.classic.QueryParser.jj_3_1(QueryParser.java:491)
at org.apache.lucene.queryparser.classic.QueryParser.jj_2_1(QueryParser.java:477)
at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:228)
at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:183)
at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:172)
at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:127)
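As a side note, this particular trace points at the dashboard query itself rather than the cluster: Lucene's classic query parser treats an unescaped / as the start of a regular expression, so PartsView/PartsView hits end-of-input before a closing slash. A sketch of a corrected query string (the \\/ is a JSON-escaped backslash, so Lucene receives \/):
Code: Select all
{"query_string":{"query":"PartsView\\/PartsView"}}
This parse failure is separate from the NoNodeAvailableException that stops Logstash.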
Last edited by tmcdonald on Wed Aug 02, 2017 4:59 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
Re: Logstash stopping abruptly
In total I have 4 nodes; however, when I check them with the command below on each server, only 2 nodes are displayed, even though all 4 nodes have the same cluster ID.
curl -XGET localhost:9200/_nodes/jvm?pretty
It looks like all 4 nodes are not communicating with each other; only 2 nodes are in sync.
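To see exactly which nodes each server thinks are in the cluster, the _cat/nodes API run on every node is also useful (a generic check, assuming the default port):
Code: Select all
curl -XGET 'localhost:9200/_cat/nodes?v'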
- tacolover101
Re: Logstash stopping abruptly
Ruh roh, it looks like you may have hit a split-brain, where each half of the cluster elects its own master and starts running solo. Can you upload an NLS profile from your different machines?
Something that helps prevent this in the future, so that a partitioned minority can't elect its own master, is to run your cluster with an odd number of nodes, such as 3 or 5, and then set your minimum master nodes to 2 or 3 respectively (see the config sketch below). This still lets you take machines down for maintenance and upgrades as needed. Elastic does a great job of explaining it here - https://www.elastic.co/guide/en/elastic ... r-election
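With the Zen discovery module used by the Elasticsearch versions these logs come from, that's a one-line setting in elasticsearch.yml on each node; a minimal sketch, assuming a 5-node cluster where every node is master-eligible:
Code: Select all
# elasticsearch.yml -- set to a majority of master-eligible nodes: (N / 2) + 1
# e.g. (5 / 2) + 1 = 3 for a 5-node cluster
discovery.zen.minimum_master_nodes: 3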
Re: Logstash stopping abruptly
Thanks for the assist, @tacolover101! OP, let us know if you need further assistance.
Former Nagios employee
Re: Logstash stopping abruptly
I increased the watermark level to 90% and now my server is collecting logs.
I executed the command below and I can see that all 4 nodes are displayed:
curl -XGET localhost:9200/_nodes/jvm?pretty
Can you help me understand the reason for this split-brain occurrence?
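For reference, the disk watermark can also be raised live through the cluster settings API rather than editing config files; a sketch along these lines, with the high watermark at the 90% mentioned above (the low value is an assumed example):
Code: Select all
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}'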
Re: Logstash stopping abruptly
We would need the complete historical logs from each node during the event to say for sure what caused it. Typically it's either network issues or instability on one or more nodes that causes them to crash.
Former Nagios employee
https://www.mcapra.com/
- dwhitfield
Re: Logstash stopping abruptly
Can you PM me the two profiles?
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.