Questions about NLS cluster

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Questions about NLS cluster

Post by WillemDH »

So I'm in the process of installing the second NLS node.

After it is installed, does it matter to what node I'm sending the logs? Would it be worth using a load balancer, taking into account the extra load on the load balancer? As we have two datacenters and each one will have one NLS, we could just send all logs from each location to that location's NLS?

Another question: how does NLS handle split-brain situations?

Can I safely reboot NLS nodes? What if the network connection between the datacenters breaks temporarily?

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Questions about NLS cluster

Post by tmcdonald »

WillemDH wrote: After it is installed, does it matter to what node I'm sending the logs?
No sir. Elasticsearch takes care of this on the backend so everything is searchable from any node, and you can send logs to any node. It's really quite cool how they handle it, but I'll spare the gory details for now :)
WillemDH wrote: Would it be worth using a load balancer, taking into account the extra load on the load balancer? As we have two datacenters and each one will have one NLS, we could just send all logs from each location to that location's NLS?
A load balancer can be used, but some of them have a nasty habit of rewriting the source of logs coming through (I believe BanditBBS asked about this as well). As for the load, it's pretty evenly spread out by default. Logs are shared across all nodes ("instances" in most of our documentation) and replicas are maintained, so the work for a given piece of data is kept, as far as possible, to the nodes that hold a copy of it.

As for maintaining this across two datacenters, I am not sure. You will almost certainly need some sort of VPN set up between them.
WillemDH wrote: Another question: how does NLS handle split-brain situations?
So there are typically two or more copies of all logs available, spread throughout the cluster. The data is stored in what are called "shards", and each shard has a primary copy and one or more replicas. These copies are balanced as much as possible throughout the cluster. If one copy goes down you have the other to rely on. As far as making sure the logs get saved properly, I believe a failure status will be returned internally if a node cannot be reached for replication. How this is actually handled (retry? fail gracefully? cry?) I am unsure.
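On the split-brain part specifically, here is a rough sketch of the majority arithmetic that older Elasticsearch versions rely on. It assumes the Elasticsearch underneath NLS uses zen discovery with the `discovery.zen.minimum_master_nodes` setting; whether NLS sets or manages that value for you is not something this thread confirms, so treat it as an assumption and check before changing anything. The function names are made up for illustration.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def can_elect_master(nodes_on_this_side: int, master_eligible: int) -> bool:
    """True if this side of a network split still holds a majority."""
    return nodes_on_this_side >= quorum(master_eligible)


if __name__ == "__main__":
    # (total master-eligible nodes, nodes on each side of the broken link)
    scenarios = [(2, (1, 1)), (3, (2, 1)), (4, (2, 2))]
    for total, sides in scenarios:
        verdicts = [can_elect_master(n, total) for n in sides]
        print(f"{total} nodes, split {sides}: "
              f"quorum={quorum(total)}, can elect master={verdicts}")
```

With an even split (one node per datacenter, or two per datacenter), neither side holds a majority, so under a majority rule neither side would elect its own master while the link is down. That avoids two diverging clusters at the cost of availability during the outage. An odd number of master-eligible nodes lets one side keep going.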
WillemDH wrote: Can I safely reboot NLS nodes? What if the network connection between the datacenters breaks temporarily?
You can, but you want to make sure that you always have enough nodes up in the cluster to handle failover/redundancy, and that the available nodes always hold at least one copy of all your data between them. In this case, having at least two nodes in each location with 3 replica shards + 1 primary, for a total of 4 copies, should help. I believe the default is 1 primary and 1 replica, so this will need to be changed.
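To make the replica arithmetic concrete, here is a minimal sketch. It assumes the replica count is changed through Elasticsearch's index settings update (`PUT /_settings` with an `index.number_of_replicas` value); NLS may expose or manage this differently, so that part is an assumption. The code only builds the request body and does not contact any server. The helper names are hypothetical.

```python
import json


def replica_settings_body(replicas: int) -> str:
    """JSON body for an Elasticsearch index settings update (not sent here)."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    return json.dumps({"index": {"number_of_replicas": replicas}})


def survives_node_loss(total_nodes: int, nodes_lost: int, replicas: int) -> bool:
    """Worst case: is at least one copy of every shard guaranteed to remain?

    Elasticsearch does not place two copies of the same shard on one node,
    so the number of copies actually allocated is capped by the node count.
    """
    copies = min(1 + replicas, total_nodes)
    return nodes_lost < copies


if __name__ == "__main__":
    print(replica_settings_body(3))
    # Four nodes, a whole datacenter (two nodes) goes away:
    print("1 replica :", survives_node_loss(4, 2, replicas=1))
    print("3 replicas:", survives_node_loss(4, 2, replicas=3))
```

With four nodes and the default of one replica, losing both nodes in one datacenter could take out both copies of some shard. With three replicas every node holds a full copy, so either datacenter can disappear without losing data.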

This article might explain it better:

http://stackoverflow.com/questions/1569 ... sticsearch
Former Nagios employee
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: Questions about NLS cluster

Post by WillemDH »

Thanks Trevor,

I read through the article and it was really useful information. I guess that once an event is saved it doesn't change anymore, so split-brain situations are maybe not really relevant, as long as the primary and replica shards replicate again when the link is available again.

You can close this thread.

Grtz