
Nagios Cluster Status RED

Posted: Sun Dec 23, 2018 11:00 pm
by floki
Hi guys, I'm having this problem:
# curl 'localhost:9200/_cluster/health?level=indices&pretty'
{
"cluster_name" : "8e37c562-7430-4a09-95ec-24e7144f3d25",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 137,
"active_shards" : 137,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 135,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,

Note: all indices have yellow or green status.

When I look into the Elasticsearch logs, this is the result:
observer: timeout notification from cluster service. timeout setting [1m], time since start

When I look into the Logstash logs, this is the result:
"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection refused (Connection refused)", :class=>"Manticore::SocketException", :level=>:error}

Anybody pleaaaaase help :)

Re: Nagios Cluster Status RED

Posted: Wed Dec 26, 2018 1:51 pm
by cdienger
The messages in the logs may be unrelated and/or temporary: if Elasticsearch were down, the curl command querying it wouldn't work at all. What is the full output of the "curl 'localhost:9200/_cluster/health?level=indices&pretty'" command? I've not seen an instance where the cluster reports red while all indices are okay. https://support.nagios.com/kb/article/n ... th-90.html has more information on the red status, plus some additional commands to get more detail and possibly fix it.
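
For anyone following along, here is a rough sketch of one way to spot a red index from the shell. It assumes Elasticsearch is listening on localhost:9200 as in the original post, and that the `_cat/indices` output has the health in the first column and the index name in the third. `red_indices` is a made-up helper name, not something shipped with Nagios Log Server.

```shell
# Hypothetical helper: read `_cat/indices` output on stdin and print the
# names of indices whose health column (first field) is "red".
red_indices() {
  awk '$1 == "red" { print $3 }'
}

# Usage against a live node (assumes Elasticsearch on localhost:9200):
#   curl -s 'localhost:9200/_cat/indices' | red_indices
```

If this prints nothing while the cluster health is still red, the full `level=indices` output asked for above is the next thing to look at.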

Re: Nagios Cluster Status RED

Posted: Sun Dec 30, 2018 10:50 pm
by floki
The link you posted really helped me understand the fundamentals of the Nagios Log Server architecture. Problem solved: I deleted the index that had the red status. Thanks a lot!
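
The exact command used is not given in the thread, so the following is only a sketch of how deleting a single index might look with the standard Elasticsearch delete-index call. `delete_index` and the `CONFIRM` variable are hypothetical names introduced here, and the index name in the usage line is a placeholder. Deleting an index permanently removes its data, so the helper only prints the command unless `CONFIRM=1` is set.

```shell
# Hypothetical helper: delete one index by name on localhost:9200.
# Without CONFIRM=1 it only prints the command it would run.
delete_index() {
  index="$1"
  url="localhost:9200/${index}"
  if [ "${CONFIRM:-0}" = "1" ]; then
    curl -s -XDELETE "$url"
  else
    echo "would run: curl -XDELETE '$url'"
  fi
}

# Usage (the index name is a placeholder):
#   delete_index logstash-2018.12.01            # dry run
#   CONFIRM=1 delete_index logstash-2018.12.01  # actually deletes
```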