Safe way to add storage to clustered nodes or reboot nodes?

Post by **zbarnett** » Wed Mar 24, 2021 3:02 pm

When working with Nagios Log Server clusters, is there anything I should be aware of when doing maintenance on individual nodes so that the cluster is not negatively affected? For example:

1) When adding storage to an individual node, can I simply perform the usual steps for adding storage and expanding a logical volume in a Linux server, as long as I only touch one node at a time, or are there special steps or considerations I should follow to avoid breaking the clustering?

2) Likewise, when patching and rebooting an individual node, is it safe to do so as long as I only bring down one node at a time and allow it to rejoin the cluster before bringing down the next node, or are there other steps that I should be aware of?

Post by **ssax** » Thu Mar 25, 2021 2:13 pm

Before making any modifications to the nodes, I would make sure that you have snapshots & maintenance set up, that your log data is being backed up to a snapshot repository, and that you are taking Log Server backups, just in case something unexpected occurs. You can see how here:

https://support.nagios.com/kb/article.php?id=68

Make sure the cluster health is green before you start doing node maintenance.
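One way to check this from the shell is to query Elasticsearch's `_cluster/health` endpoint and read the `status` field. The following is a minimal sketch, not an official Nagios tool: the `ES_URL` variable and the function names are made up for illustration, and it assumes Elasticsearch is listening on localhost:9200 on the node.

```shell
#!/bin/sh
# Sketch: read the cluster status from Elasticsearch's _cluster/health API.
# ES_URL is an assumption (localhost:9200 on the node you are logged in to).
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the "status" value from a _cluster/health JSON document on stdin.
parse_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Print the current cluster status (green, yellow, or red).
cluster_status() {
    curl -s "$ES_URL/_cluster/health" | parse_status
}

# Succeed only when the cluster reports green.
cluster_is_green() {
    [ "$(cluster_status)" = "green" ]
}
```

After sourcing the file, something like `cluster_is_green && echo "OK to start maintenance"` gives a quick go/no-go check.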

Then, on the node you're working on, I would stop logstash and elasticsearch before doing anything:
- While not strictly necessary, this can help prevent contention issues when you're interacting with the disks

```shell
systemctl stop logstash
systemctl stop elasticsearch
```

Then make your changes and start the services back up afterwards:

```shell
systemctl start logstash
systemctl start elasticsearch
```

For both 1) and 2), the only thing I would add is that after you're done and have brought the node (or its services) back up, you should wait until the cluster status turns green before moving on to the next node, so that everything gets synced up properly.
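If you want to script the waiting, a polling loop along these lines could work. This is only a sketch under assumptions: `wait_for_green`, `fetch_health`, `ES_URL`, and the default retry count and delay are illustrative names and values, not anything shipped with Log Server.

```shell
#!/bin/sh
# Sketch: block until the cluster reports green, or give up after N tries.
# ES_URL, the retry count, and the delay are assumptions to adjust.
ES_URL="${ES_URL:-http://localhost:9200}"

# Fetch the raw _cluster/health JSON from Elasticsearch.
fetch_health() {
    curl -s "$ES_URL/_cluster/health"
}

# wait_for_green [max_tries] [delay_seconds]
# Returns 0 once the status is green, 1 if it never turned green.
wait_for_green() {
    max_tries="${1:-60}"
    delay="${2:-10}"
    try=1
    while [ "$try" -le "$max_tries" ]; do
        status=$(fetch_health \
            | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' \
            | head -n 1)
        echo "attempt $try/$max_tries: cluster status is ${status:-unknown}"
        [ "$status" = "green" ] && return 0
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}
```

For example, `wait_for_green 90 10 || echo "cluster is still not green, do not move on yet"` would poll for up to roughly 15 minutes before giving up.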

Post by **cdienger** » Thu Mar 25, 2021 2:21 pm

Welcome to the forums, @zbarnett!

In both instances I would work on one node at a time, as you've indicated. The only thing I would add is to disable shard allocation first and enable it again after you've done the upgrade/maintenance/etc. The commands to disable/enable shard allocation are found in our upgrade guide at https://assets.nagios.com/downloads/nag ... Server.pdf:

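```shell
# Disable shard allocation before starting maintenance on the node
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'
# Re-enable shard allocation once the node is back up
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'
```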
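Putting the steps from this thread together, the per-node sequence might be wrapped up roughly like this. Treat it as a sketch rather than a supported script: `DRY_RUN`, `run`, `set_allocation`, `begin_maintenance`, and `end_maintenance` are hypothetical names, and it defaults to only printing the commands so you can review them first.

```shell
#!/bin/sh
# Sketch of the per-node sequence discussed in this thread.
# DRY_RUN and all function names are illustrative assumptions; the curl and
# systemctl commands themselves are the ones quoted in the posts.
ES_URL="${ES_URL:-localhost:9200}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# set_allocation none|all : toggle shard allocation via a transient setting.
set_allocation() {
    run curl -XPUT "$ES_URL/_cluster/settings" \
        -d "{\"transient\":{\"cluster.routing.allocation.enable\":\"$1\"}}"
}

# Run on the node before touching disks or rebooting.
begin_maintenance() {
    set_allocation none
    run systemctl stop logstash
    run systemctl stop elasticsearch
}

# Run on the node after the work is done.
end_maintenance() {
    run systemctl start logstash
    run systemctl start elasticsearch
    # Elasticsearch may need a moment before it accepts requests;
    # repeat this call if it fails right after the service starts.
    set_allocation all
}
```

With the default `DRY_RUN=1`, calling `begin_maintenance` just lists the three commands it would run. Set `DRY_RUN=0` once you are happy with them, and still wait for the cluster to go green before moving to the next node.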

Nagios Support Forum
