Database problems after moving backup to new cluster

Posted: Wed Apr 17, 2019 11:23 am
by Jklre
We are getting ready to upgrade Nagios Log Server to the latest version and migrate it to a new cluster before decommissioning the old one. I installed the latest version, restored a backup of the old database, and then activated the licence key. I am running into a few issues.

1) The database appears to be missing all of its indices.

2) Logstash does not appear to be collecting any logs.

Documents	9,740,000
Total Shards	19
Successful Shards	18
Indices	0
Primary Size	1.1GB
Total Size	1.2GB

Status	Red
Timed Out?	false
# Instances	2
# Data Instances	2
Active Primary Shards	11
Active Shards	18
Relocating Shards	0
Initializing Shards	1
Unassigned Shards	0

For Logstash, when I try to send some test logs I get connection refused:

while read i; do echo "$i" | nc -u -v unls03lxv.corp.int 514 ; done < sendstuffover
nc: Write error: Connection refused

Thank you.

Re: Database problems after moving backup to new cluster

Posted: Wed Apr 17, 2019 2:28 pm
by cdienger
What steps did you take to restore the database? Are the new nodes members of the old cluster?

To move to a new cluster, I generally recommend adding new nodes one at a time and allowing them to sync, then removing the old nodes, again one at a time, allowing the cluster time to resync after each node has been removed.

One important note is that closed indices will not replicate, so you'll need to open them first. A script to open closed indices can be found at https://github.com/elastic/elasticsearch/issues/12963. An example sequence of steps would be:

-add a node
-run the script
-wait for the status to go green
-add second node
-run the script
-wait for the status to go green
-remove old node
-wait for status to go green
-remove second old node
-wait for status to go green

After this the data should be migrated.
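
The "wait for the status to go green" steps above could be scripted roughly as follows. This is only a sketch: it assumes the Elasticsearch HTTP API is reachable at http://localhost:9200 (set ES_URL if it is not), and the helper names status_from_health_json and wait_for_green are made up for illustration, not part of Nagios Log Server.

```shell
# Sketch only. ES_URL and both helper names are assumptions for illustration.
ES_URL="${ES_URL:-http://localhost:9200}"

# Pull the "status" value (green, yellow or red) out of a
# _cluster/health JSON response passed as the first argument.
status_from_health_json() {
    printf '%s' "$1" |
        sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Poll cluster health until it is green.
# $1 = maximum number of attempts (default 60)
# $2 = seconds to sleep between attempts (default 10)
wait_for_green() {
    max_tries="${1:-60}"
    delay="${2:-10}"
    try=1
    while [ "$try" -le "$max_tries" ]; do
        status="$(status_from_health_json "$(curl -s "$ES_URL/_cluster/health")")"
        if [ "$status" = "green" ]; then
            echo "cluster is green"
            return 0
        fi
        echo "status: ${status:-unknown} (attempt $try of $max_tries)"
        sleep "$delay"
        try=$((try + 1))
    done
    echo "gave up waiting for green" >&2
    return 1
}
```

Running wait_for_green after each add or remove step, and only moving on when it returns successfully, keeps the migration from getting ahead of shard replication.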

Re: Database problems after moving backup to new cluster

Posted: Wed Apr 17, 2019 3:47 pm
by Jklre
Thanks for the response.

I restored the database using the following steps:

sudo /usr/local/nagioslogserver/scripts/restore_backup.sh /store/backups/nagioslogserver/nagioslogserver.2019-04-10.1554937201.tar.gz

One concern about adding the new nodes to the existing cluster and syncing that way: they are different versions. The old cluster is Nagios Log Server 1.4.4 and the new cluster is Nagios Log Server 2.0.7.

Is there a way to wipe or reset the database on the new nodes before attempting to sync with the existing cluster? I wouldn't want to corrupt any of the nodes that are in production. Ideally we would like to run two clusters in tandem before taking down the existing node, to verify it's stable.

Re: Database problems after moving backup to new cluster

Posted: Wed Apr 17, 2019 4:21 pm
by cdienger
restore_backup.sh will restore the part of the database responsible for holding settings (users, dashboards, etc.). The actual log data is stored elsewhere.

Do the 1.4.4 machines have a repo configured (Administration > System > Backup & Maintenance)? If so, we may be able to ditch the idea of clustering them all together and continue to run the two clusters separately. When you're ready to cut over, you can just configure the new cluster to use the same repo and pull in data from there. There's a bit more to it than that, but we can get into that later. We should first determine why the new cluster isn't accepting data. Did the restore script import the Logstash config? Port 514 is considered a privileged port, and there are some extra steps needed to open it up. See: https://assets.nagios.com/downloads/nag ... Server.pdf.

Please PM me a profile if this doesn't resolve the problem of not receiving data. The profile can be generated under Admin > System > System Status.
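
As a small illustration of the privileged-port point: on Linux, ports below 1024 can by default only be bound by root, which is why an input on 514 needs extra steps while a high-numbered port does not. The helper names below (is_privileged_port, describe_port) are hypothetical and exist only for this sketch.

```shell
# Sketch only. Both helper names are hypothetical.

# Succeeds (exit status 0) when the port is in the privileged range 1-1023.
is_privileged_port() {
    [ "$1" -ge 1 ] && [ "$1" -lt 1024 ]
}

# Print a one-line description of what binding to the port requires.
describe_port() {
    if is_privileged_port "$1"; then
        echo "port $1 is privileged: binding requires root or extra configuration"
    else
        echo "port $1 is unprivileged: a non-root service can bind to it"
    fi
}
```

For example, describe_port 514 reports the port as privileged. On the server itself, something like ss -lun (listening UDP sockets, numeric ports) should show whether anything is actually bound to 514; a "connection refused" on a UDP send usually means nothing is listening there.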