Nagios Log Server is showing RED in its status

srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

Hello,

We are facing the following problems with our Nagios Log Server:

1. The status of the instance is showing RED.

2. All the shards in Health are showing as unassigned.

3. Users cannot log in to Nagios using the URL.

Please find screenshots of the above attached.

This all started when our disk filled up. We increased the disk space and rebooted the machine, but the problems persisted. We followed everything in the document https://support.nagios.com/kb/article.php?id=90. As we do not have any backups, we did not run the index deletion command, and as there are almost 1822 shards, I am not sure whether I can run the reassignment command for all of them at once.

Please let me know if there is anything else we need to do.

Thanks!

Srinivas Mandalika
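
On the question of reassigning many shards at once: a minimal sketch of how the unassigned primaries could be collected from `_cat/shards` and turned into one reroute request each. It assumes the Elasticsearch 1.x-style `allocate` reroute command and the default `_cat/shards` column order (index, shard, prirep, state); the node name and the function names are hypothetical. Note that `allow_primary` can discard whatever data the shard held, so this is a last resort and not a recommendation.

```shell
# Sketch only: assumes the Elasticsearch 1.x "allocate" reroute command.

# Read `_cat/shards` output on stdin and print "index shard" for every
# primary shard whose state column is UNASSIGNED.
list_unassigned_primaries() {
  awk '$3 == "p" && $4 == "UNASSIGNED" { print $1, $2 }'
}

# Print the JSON body for one reroute "allocate" command.
# Arguments: index name, shard number, node name.
# allow_primary=true may lose the data on that shard; use with care.
reroute_body() {
  printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":true}}]}\n' \
    "$1" "$2" "$3"
}

# Example driver, commented out so nothing is changed by accident:
# NODE="my-node-name"   # hypothetical; take the real name from _cat/nodes
# curl -s 'localhost:9200/_cat/shards' | list_unassigned_primaries |
#   while read -r index shard; do
#     curl -s -XPOST 'localhost:9200/_cluster/reroute' \
#       -d "$(reroute_body "$index" "$shard" "$NODE")"
#   done
```

The sketch deliberately skips replica shards, since on a single-node instance a replica cannot be placed on the same node as its primary.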
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Post by mcapra »

Can you send over the contents of your Elasticsearch logs? Please send all files in this path (if possible):

/var/log/elasticsearch

Can you also share the full output of the following commands, run from the CLI of your Nagios Log Server machine:

df -h
curl -s localhost:9200/_cat/shards
curl 'localhost:9200/_cat/nodes?v'
curl 'localhost:9200/_cat/master?v'
curl -XGET 'http://localhost:9200/_cluster/health/*?level=shards&pretty'
curl -XGET 'localhost:9200/_nodes/jvm?pretty'
Former Nagios employee
https://www.mcapra.com/
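
With around 1822 shards, the raw `_cat/shards` listing is long, so a per-state tally can make it easier to read. A minimal sketch, assuming the state is the fourth column of the default `_cat/shards` output; the function name is made up for illustration.

```shell
# Read `_cat/shards` output on stdin and print each shard state
# together with the number of shards in that state, sorted by state.
count_shard_states() {
  awk '{ n[$4]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example (requires Elasticsearch to be running):
# curl -s 'localhost:9200/_cat/shards' | count_shard_states
```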
srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

I am unable to upload files above a certain size (I think the limit is 2 or 3 MB per file, and no more than 3 files), so I have uploaded all I can, including the output of the given commands.

Let me know if there is any other way to send you the files, or if you need any further information.

Thanks!

Srinivas Mandalika
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Post by dwhitfield »

Although space will still be a concern, you can get them all into one file with tar -zcvf /tmp/supporttar.tar.gz /var/log/elasticsearch.

If that is too large, PM me and I'll have you email it. If it is too large for email, you can put it on Dropbox or another file-sharing site; we can download it, and then you can delete it.
srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

I have uploaded it to Jumpshare. Please find the link below:

http://jmp.sh/L8cUMN4

Thanks!

Srinivas Mandalika
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Post by mcapra »

It looks like this machine may be running out of space:

TranslogException[[logstash-2017.05.03][0] failed to create new translog file]; nested: IOException[No space left on device]; ]]

Can you try freeing up some space on this machine or increasing the disk size? To free up some space, you can delete old indices with the following command (Elasticsearch will need to be running):

DISCLAIMER: THIS WILL DELETE SOME DATA AND IT WILL NOT BE RECOVERABLE

curator delete indices --older-than 60 --time-unit days --timestring %Y.%m.%d

Replace the 60 in that command with whatever is appropriate for your environment. The example above deletes any data older than 60 days.
Former Nagios employee
https://www.mcapra.com/
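
Since the deletion is not recoverable, it may help to preview which indices a given cutoff would cover before running it. A minimal sketch, assuming the indices follow the `logstash-YYYY.MM.DD` naming seen in the log excerpt above; the function name is hypothetical, and Curator's own age calculation may differ slightly at the boundary.

```shell
# Read index names on stdin and print the logstash-YYYY.MM.DD indices
# dated strictly before the cutoff, which is given as YYYY.MM.DD.
indices_older_than() {
  awk -v cutoff="$1" '
    BEGIN { gsub(/\./, "", cutoff) }   # 2017.03.04 -> 20170304
    $1 ~ /^logstash-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ {
      d = substr($1, 10)               # strip the "logstash-" prefix
      gsub(/\./, "", d)
      if (d + 0 < cutoff + 0) print $1
    }'
}

# Example (GNU date; requires Elasticsearch to be running):
# CUTOFF="$(date -d '60 days ago' +%Y.%m.%d)"
# curl -s 'localhost:9200/_cat/indices?h=index' | indices_older_than "$CUTOFF"
```

This only lists names and deletes nothing, so it is safe to run repeatedly while settling on a retention period.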
srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

I did what you said, but I still see the status as RED.

Srinivas Mandalika
srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

Free space is around 138 GB
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Post by mcapra »

Did you restart the Elasticsearch service afterwards on each Nagios Log Server instance?

Can I get fresh outputs of all the following commands:

service elasticsearch restart
df -h
curl -s localhost:9200/_cat/shards
curl 'localhost:9200/_cat/nodes?v'
curl 'localhost:9200/_cat/master?v'
curl -XGET 'http://localhost:9200/_cluster/health/*?level=shards&pretty'
curl -XGET 'localhost:9200/_nodes/jvm?pretty'
Former Nagios employee
https://www.mcapra.com/
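
After a restart, recovery of many shards can take a while, so the status may stay RED for some time before it changes. A minimal sketch for pulling just the status field out of the plain `_cluster/health` response (without `level=shards`, so that only one `status` field is present); the function name is made up for illustration.

```shell
# Read a `_cluster/health` JSON response on stdin and print the value
# of its "status" field (green, yellow, or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example (requires Elasticsearch to be running):
# curl -s 'localhost:9200/_cluster/health?pretty' | health_status
```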
srinivasmandalika
Posts: 51
Joined: Thu Oct 20, 2016 4:09 pm

Post by srinivasmandalika »

Please find the output in the attached file.

Srinivas Mandalika