
Logstash, elasticsearch failure

Posted: Mon Jun 11, 2018 3:15 am
by mfur
Hello,

Our logging server (2.0.2) recently stopped accepting logs altogether. Logstash is not working, Elasticsearch is also reporting errors, and no new logs are arriving. I have tried rebooting (multiple times), closing all indices, restarting Logstash and Elasticsearch, and following this article: https://support.nagios.com/kb/article/n ... 0-778.html

Code:

[root@vmd18653 logstash]# cat logstash.log
{:timestamp=>"2018-06-11T09:59:41.521000+0200", :message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-06-11T09:59:41.616000+0200", :message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-06-11T09:59:41.629000+0200", :message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-06-11T09:59:41.642000+0200", :message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}
{:timestamp=>"2018-06-11T09:59:41.653000+0200", :message=>"Pipeline main started"}
{:timestamp=>"2018-06-11T09:59:42.048000+0200", :message=>"Pipeline main has been shutdown"}
{:timestamp=>"2018-06-11T09:59:44.665000+0200", :message=>"stopping pipeline", :id=>"main"}
The main issue is probably with Logstash. Elasticsearch is reporting either:

Code:

[2018-06-11 04:00:02,537][DEBUG][action.search.type       ] [c54451f4-aeea-472a-8493-bc0a2b047c83] [logstash-2017.08.25][3], node[fVuhPdqBTcCHbETIl_NVPw], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@5a9cb55e] lastShard [true]
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@62a8a53f
or

Code:

[2018-06-11 09:59:52,985][DEBUG][action.search.type       ] [c54451f4-aeea-472a-8493-bc0a2b047c83] All shards failed for phase: [query]
org.elasticsearch.action.NoShardAvailableActionException: [nagioslogserver][4] null
(after old indices are closed)
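
For anyone triaging a similar log: the two Elasticsearch errors point in different directions. EsRejectedExecutionException with "queue capacity 1000" means the search queue was full when the request arrived, while NoShardAvailableActionException means no active shard copy could serve the query. A minimal sketch for counting how often each exception appears in a log file follows; the function name tally_exceptions is hypothetical, and the log file path is left as a command-line argument because it depends on the installation.

```python
import re
import sys
from collections import Counter

# Matches a fully qualified Java exception or error class at the start of a
# line, optionally preceded by "Caused by: ". Stack frames ("at ...") and
# bracketed log lines do not match.
EXCEPTION_RE = re.compile(
    r'^\s*(?:Caused by: )?'
    r'((?:[A-Za-z_$][\w$]*\.)+[A-Za-z_$][\w$]*(?:Exception|Error))\b'
)


def tally_exceptions(lines):
    """Count exception class names that begin a line in the given log lines."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage (path is an assumption, pass your own): python tally.py <logfile>
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for name, count in tally_exceptions(handle).most_common():
                print(f"{count:6d}  {name}")
```

A tally dominated by rejected executions would suggest an overloaded search queue, whereas mostly NoShardAvailableActionException would suggest unassigned or closed shards; this is only a rough first split, not a diagnosis.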

Disk space is available. The "Java" process belonging to Nagios is using 54% of RAM, with 5 GB still available.

How should I approach this? Thank you.
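
One quick check for the "Connection refused" errors in the Logstash log is whether anything is listening where Logstash sends its output. The sketch below simply attempts a TCP connection. It assumes Elasticsearch runs on the same machine and uses its default ports (9200 for HTTP, 9300 for transport); neither the host nor the ports are confirmed anywhere in this thread, so adjust them to match your configuration. The helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: Elasticsearch default ports on the local machine.
    for label, port in (("HTTP", 9200), ("transport", 9300)):
        state = "open" if port_is_open("127.0.0.1", port) else "closed or unreachable"
        print(f"Elasticsearch {label} port {port}: {state}")
```

If the port is closed, the problem is more likely on the Elasticsearch side (not running, or bound to a different address) than in Logstash itself.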

Re: Logstash, elasticsearch failure

Posted: Mon Jun 11, 2018 3:33 am
by mfur
Looks like this is solved: I reapplied the Nagios configuration via the UI and loggregator is accepting logs again.

Re: Logstash, elasticsearch failure

Posted: Mon Jun 11, 2018 9:37 am
by tmcdonald
Great to hear! Did you have further (related) questions or are we good to lock this up?