
Re: 500 Server error

Posted: Tue Jan 19, 2016 2:22 pm
by jolson
It must be Kibana. Did you save any of the backup files that I mentioned earlier? The backup files contain your dashboards, logstash configs, web GUI configuration, users, and some other information.

We'll need to untar a backup from a known good point in the past and run the following:

Code:

cp /store/backups/nagioslogserver/nagioslogserver.2016-01-14.1452805592.tar.gz ~/
cd ~
tar zxf nagioslogserver.2016-01-14.1452805592.tar.gz
curl -XDELETE "http://localhost:9200/kibana-int/"
curl -XPOST "http://localhost:9200/kibana-int/_import?path=/root/nagioslogserver.2016-01-14.1452805592/kibana-int.tar.gz"
Be sure to run the XPOST almost immediately after the XDELETE - otherwise the kibana-int database may try to regenerate.
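To keep the gap between the two calls as small as possible, they can be wrapped in one function so the XPOST fires as soon as the XDELETE returns. This is only a sketch: the function name and the ES_URL, BACKUP_DIR and CURL variables are made up for illustration, and the defaults just repeat the URL and path from the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper: delete kibana-int and re-import it back to back.
# ES_URL, BACKUP_DIR and CURL are illustrative names, not product settings.
ES_URL="${ES_URL:-http://localhost:9200}"
BACKUP_DIR="${BACKUP_DIR:-/root/nagioslogserver.2016-01-14.1452805592}"
CURL="${CURL:-curl}"

restore_kibana_int() {
    # Drop the current kibana-int index...
    "$CURL" -XDELETE "$ES_URL/kibana-int/"
    # ...and immediately import the copy from the unpacked backup.
    "$CURL" -XPOST "$ES_URL/kibana-int/_import?path=$BACKUP_DIR/kibana-int.tar.gz"
}
```

After sourcing the file, run `restore_kibana_int` once the backup has been unpacked.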

Let me know if the above works out for you, thanks Bandit!

Re: 500 Server error

Posted: Tue Jan 19, 2016 2:27 pm
by BanditBBS
The backups are rotated and only go back 5 days, it seems :( This issue has been going on much longer than that. Let me check with our backup guy and see if we actually back this up onto tape/disk somewhere.

Re: 500 Server error

Posted: Tue Jan 19, 2016 4:31 pm
by BanditBBS
No backups :(

Figured it is clustered, why do backups...apparently there is a flaw in this logic.

Re: 500 Server error

Posted: Tue Jan 19, 2016 6:11 pm
by jolson
BanditBBS wrote: Figured it is clustered, why do backups...apparently there is a flaw in this logic.
Unfortunately that is true. Backups still need to be taken in case there is corruption or a hard crash that hits the whole cluster, or in case there is a point in time you'd like to revert to, much as you'd still back up a RAID-1 system.

I'm trying to think of a way to revert the Kibana database to a working point in time. You could certainly use a default Kibana database, but unfortunately that would erase all of your customization (mail settings, logstash configs, etc.). Have you created a lot of alerts/dashboards/filters/etc.? I'd be happy to get on a remote session with you, salvage what we can, and make a new cluster to move the data/configs into.
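For keeping restore points older than the rotation window, something along these lines could copy the rotated archives to a second location before they age out. It is a rough sketch, not a supported tool: the function name and the destination directory are hypothetical, and only /store/backups/nagioslogserver comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: copy any *.tar.gz in SRC that DEST does not have yet.
archive_backups() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    for f in "$src"/*.tar.gz; do
        # Skip the literal pattern when SRC contains no archives.
        [ -e "$f" ] || continue
        name=$(basename "$f")
        # Only copy archives that are not already stored in DEST.
        [ -e "$dest/$name" ] || cp -p "$f" "$dest/$name"
    done
}
```

Called as, say, `archive_backups /store/backups/nagioslogserver /mnt/offbox/nagioslogserver` from a daily cron job (the /mnt/offbox path is an assumption), it would leave the rotation itself untouched.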

Email [email protected] and reference this thread - let's set up a 1-2 hour remote next week to get this worked out!

Re: 500 Server error

Posted: Tue Jan 19, 2016 7:45 pm
by BanditBBS
I have barely changed anything and can easily redo all of the setting changes. Replacing that DB, does that include any alerts I have set up as well?

And next week? GULP!

Re: 500 Server error

Posted: Wed Jan 20, 2016 12:03 pm
by tmcdonald
BanditBBS wrote: replacing that DB, does that include any alerts I have set up as well?
I would assume so, since that information is stored in the DB as well.
BanditBBS wrote: And next week? GULP!
Busy week for us, and our individual calendars fill up fast! We can go into specifics in the ticket.

Re: 500 Server error

Posted: Wed Jan 20, 2016 12:11 pm
by jolson
BanditBBS wrote: does that include any alerts I have set up as well?
Correct - the alerts are in the Elasticsearch DB, and can _possibly_ be pulled if you need them. I can work with the devs and try to make this happen if you have a large number of alerts (like Jklre, for instance).

I'm out Thursday and Friday, or I would try to schedule the session for then.

Re: 500 Server error

Posted: Wed Jan 20, 2016 2:06 pm
by BanditBBS
There is just one alert I'll need to recreate. So basically, we can blow this all away; as long as I keep the log data, I'll be perfectly fine.

Looks like all my dashboards and queries are magically gone now too. I basically just need to uninstall/reinstall and keep the log data and I'll be good... just need instructions.

Still want me to open a ticket?

Re: 500 Server error

Posted: Wed Jan 20, 2016 3:20 pm
by jolson
Nah, no worries - here is the procedure (take a snapshot or backup before trying it, to make sure we don't lose any data):

Code:

service logstash stop
service elasticsearch stop
cd /usr/local/nagioslogserver/elasticsearch
mv data /somewhere/safe
yum erase httpd
rm -rf /usr/local/nagioslogserver
rm -rf /var/www/html/nagioslogserver
rm /etc/rc.d/init.d/elasticsearch
rm /etc/rc.d/init.d/logstash
rm /etc/rsyslog.d/nagioslogserver.conf
rm /etc/cron.d/nagioslogserver
pip uninstall elasticsearch-curator==1.2.2
service crond restart
Now follow the normal install procedure and click 'new install'. After you have the server set up, do this:

Code:

service elasticsearch stop
cd /usr/local/nagioslogserver/elasticsearch
rm -rf data
mv /somewhere/safe/data .
service elasticsearch start
Verify that you can log in and that everything appears to be working properly.
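Besides logging in to the web GUI, a quick look at Elasticsearch itself can confirm that the restored indices were picked up. A small sketch, assuming Elasticsearch answers on localhost:9200 as in the earlier commands; the function name and the ES_URL and CURL variables are illustrative only.

```shell
#!/bin/sh
# Hypothetical check: print cluster health, then the list of indices.
ES_URL="${ES_URL:-http://localhost:9200}"
CURL="${CURL:-curl}"

check_cluster() {
    # Overall cluster status (green / yellow / red).
    "$CURL" -s "$ES_URL/_cluster/health?pretty" &&
    # One line per index, so the restored log indices can be spotted.
    "$CURL" -s "$ES_URL/_cat/indices?v"
}
```

Running `check_cluster` after the start-up should show the old log indices in the second listing if the data directory was picked up.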

In the above example where I used /somewhere/safe, choose a directory that won't be removed during the uninstall; something like /mnt would be acceptable. Make sure the partition you move your data to has enough space to store it!
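A rough way to check the space before the mv is to compare the size of the data directory with the free space on the target. This is only a sketch and the function name is made up; it relies on the standard `du -sk` and `df -Pk` output.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the filesystem holding DEST has more
# free space (in KB) than the SRC directory currently occupies.
has_room() {
    need_kb=$(du -sk "$1" | awk '{print $1}')
    # With -P, line 2 of df output is the filesystem and column 4 is "Available".
    free_kb=$(df -Pk "$2" | awk 'NR==2 {print $4}')
    [ "$free_kb" -gt "$need_kb" ]
}
```

For example, `has_room /usr/local/nagioslogserver/elasticsearch/data /mnt && echo "enough space"` before moving the data to /mnt.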

Re: 500 Server error

Posted: Wed Jan 20, 2016 5:21 pm
by BanditBBS
That wasnt working, so, I was tired of it not working and I completely rebuilt the cluster. I still have a backup of the data folder from one of the nodes, but I didn't want to risk losing any further data so did the restart. if there is some way to get my old data into the new cluster, great, if not, oh well! Luckily it hasn't been under heavy usage and I can afford to lose what I did(still would like it if possible).