
red cluster health on working cluster

Posted: Wed May 13, 2015 11:58 am
by benhank
I have two instances. One is the official VM. The second is a manual install on a clean machine. Both were installed on CentOS 6 (clean, no updates to the OS) with the previous version, then upgraded to the latest version of NLS.
My cluster seems to be working. By that I mean our environment is set up to send logs to server P01, and the data is being replicated to T01, which works. However, when I look at the cluster health, it's red.
I have run:

Code: Select all

    service httpd stop
    service logstash stop
    service elasticsearch restart
    service logstash start
    service httpd start
I then went into

Code: Select all

/var/log/elasticsearch
and here is the logfile from my secondary machine:
c9dc126e-346d-4bfa-a30e-14b849c50ab5.log
and the primary:
c9dc126e-346d-4bfa-a30e-14b849c50ab5.log
Thanks in advance!

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 12:11 pm
by jolson
This is normally an indication of failing indices. Let's take a look at your shard health. The output isn't pretty, but it gives us a good idea of what's going on:

Code: Select all

curl -XGET 'http://localhost:9200/_cluster/health/*?level=shards'
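If the shard-level output is too noisy to read, a rough sketch like the one below can narrow it down to just the red indices. It assumes your Elasticsearch version provides the `_cat/indices` endpoint with the `h=` column selector; the function name `red_indices` is made up for illustration.

```shell
# Print the names of indices whose health is "red".
# Expects "health index" pairs on stdin, one index per line, e.g. from:
#   curl -s 'http://localhost:9200/_cat/indices?h=health,index' | red_indices
red_indices() {
    awk '$1 == "red" { print $2 }'
}
```

Any index name it prints is a candidate for a closer look in the Elasticsearch logs.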

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 12:24 pm
by benhank
here we go:
Document.rtf

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 12:45 pm
by jolson
It looks like you have a bad index by the name of logstash-2015.05.05. The course of action here will be to remove that index and see if your cluster health recovers. Keep in mind that all log data from that day will be lost - you can always restore if you have a backup present.

Code: Select all

curl -XDELETE 'http://localhost:9200/logstash-2015.05.05/'
How is your cluster health after that removal?
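For a quick check, something along these lines pulls just the status word out of the cluster health response. It is only a sketch: `cluster_status` is a hypothetical helper name, and the text matching assumes the usual `"status":"green"` style field in the JSON that `_cluster/health` returns.

```shell
# Extract the value of the "status" field from cluster health JSON on stdin.
# Example call against the local node:
#   curl -s 'http://localhost:9200/_cluster/health' | cluster_status
cluster_status() {
    grep -o '"status" *: *"[a-z]*"' | sed 's/.*"\([a-z]*\)"$/\1/'
}
```

It should print green, yellow, or red, which is easier to eyeball (or script against) than the full response.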

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 12:50 pm
by benhank
Lean and green, my man, thanks! Any clue as to how that happened?

Better question:
I only deleted that on one machine. Shouldn't the other one detect that it was deleted and rebuild it?

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 1:02 pm
by jolson
I only deleted that on one machine. Shouldn't the other one detect that it was deleted and rebuild it?
Nope - the index contains both primary and replica shards, so all of the data is now gone, unfortunately. The delete API call that we used acts on the whole cluster, not just the node you ran it on.

There are many reasons why an index might become corrupted. Some of the more common ones are disk space filling up or shards being unable to initialize properly (for whatever reason). Typically, bad shards indicate data loss, so data was likely lost on the server somehow - the culprit is hard to pin down. I would get a backup schedule in place so that you have a way to recover your information if this were to happen again. Protect those bits! :)
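As one possible starting point for a scheduled backup, here is a minimal sketch built on the Elasticsearch snapshot API. It assumes a snapshot repository has already been registered; the repository name `my_backup`, the `snapshot-<date>` naming, and the helper `snapshot_url` are all placeholders for illustration, not anything NLS sets up for you.

```shell
# Build the URL for creating one snapshot in an existing repository.
# Arguments: base URL, repository name (hypothetical), snapshot date label.
snapshot_url() {
    printf '%s/_snapshot/%s/snapshot-%s?wait_for_completion=true\n' "$1" "$2" "$3"
}

# Example call, e.g. from a daily cron job (repository "my_backup" is assumed):
#   curl -XPUT "$(snapshot_url http://localhost:9200 my_backup "$(date +%Y.%m.%d)")"
```

Naming snapshots by date keeps them lined up with the daily logstash indices, which makes it easier to find the right one to restore.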

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 2:26 pm
by benhank
Thanks for the help and info! All set, my man!

Re: red cluster health on working cluster

Posted: Wed May 13, 2015 2:29 pm
by jolson
Glad I could help - I'll close this out. Please feel free to open an additional thread if you have further questions or issues. Thanks!