lost data disk

Posted: Mon Apr 27, 2020 2:26 pm
by dariusz.nalazek
Hello,

I just lost one disk on my server (human error, the data is unrecoverable):
/usr/local/nagioslogserver/elasticsearch/data

A new, healthy disk is now connected to the data directory.
All other parts of the system are OK. How do I recover from such a failure?

Should I remove the node from the cluster, then uninstall LS and reinstall it?
(remove the node like this?: https://assets.nagios.com/downloads/nag ... luster.pdf)
(uninstall the node like this?: https://support.nagios.com/forum/viewto ... 37&t=43624)

Or reinstall the whole server?

Or is there a smarter solution (e.g. just copy some data to force the node to start replication)?

Darek.

Re: lost data disk

Posted: Mon Apr 27, 2020 3:02 pm
by cdienger
Is elasticsearch back up and running? It should take care of replicating the data if necessary.

Do you see all nodes in the cluster if you run:

curl 'localhost:9200/_cat/nodes?v'
Also run the following to check the indices' health:

curl 'localhost:9200/_cat/indices?pretty'
and shard health:

curl 'localhost:9200/_cluster/health/*?level=shards&pretty'
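
If it helps, the three checks can be wrapped in one small shell function that stops early when Elasticsearch does not answer at all. This is only a sketch: the function name `es_checks` and the timeout values are my own choices, not anything shipped with Nagios Log Server.

```shell
#!/bin/sh
# Sketch only: es_checks is a hypothetical helper name, and the
# timeouts are arbitrary. It runs the three curl checks in order.
es_checks() {
    url="$1"
    # -s: no progress meter, -f: treat HTTP errors as failure,
    # --max-time: give up instead of hanging on a dead port
    if ! curl -sf --max-time 5 "$url/_cat/nodes?v"; then
        echo "no response from $url" >&2
        return 1
    fi
    # Index health, then per-shard health
    curl -s --max-time 5 "$url/_cat/indices?pretty"
    curl -s --max-time 30 "$url/_cluster/health/*?level=shards&pretty"
}
```

Call it as `es_checks localhost:9200`. If it prints "no response", the service is probably not listening yet, and the Elasticsearch log is the next place to look.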

Re: lost data disk

Posted: Tue Apr 28, 2020 4:46 am
by dariusz.nalazek
Thanks for the tip :)
It was enough to find the source of the problem :)

This time the disk did not have the proper access rights after the replacement.
The service started fine, but there was no response from :9200.
After fixing the access rights and restarting the service, everything is OK.

Darek.
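
For anyone hitting the same symptom, a quick way to test access rights on the new disk might look like the sketch below. The function name `check_data_dir` is made up for illustration, and it only tests access for whichever user runs it, so it should be run as the account the Elasticsearch service uses (the thread does not say which account that is).

```shell
#!/bin/sh
# Sketch only: check_data_dir is a hypothetical helper name.
# It reports whether the current user can read, write and enter a directory.
check_data_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "ok: $dir"
    else
        echo "no access: $dir"
        return 1
    fi
}
```

For example: `check_data_dir /usr/local/nagioslogserver/elasticsearch/data`. Running `ls -ld` on the same path shows the current owner and mode, which helps when deciding what to change.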

Re: lost data disk

Posted: Tue Apr 28, 2020 2:33 pm
by ssax
That's great to hear, I'm glad you got it fixed! Are we okay to lock this and mark it as resolved?