Disable data replication in a node

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
technosol
Posts: 36
Joined: Mon May 07, 2018 11:46 am

Disable data replication in a node

Post by technosol »

Hi,

Suppose I have a 4-node cluster. Due to unavoidable circumstances, I want to disable data replication on two of the nodes.

Cluster:

Node1+Node2+Node3+Node4

Remove data replication for:

Node2 & Node3 --> I am not taking off these nodes from the cluster. I want to let them be compute nodes in the cluster.

Data replication will be done by only Node1 & Node4.

May I know if this is doable? If yes then can you give me a detailed guide?

Thank you.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Disable data replication in a node

Post by mcapra »

technosol wrote: Node2 & Node3 --> I am not taking off these nodes from the cluster. I want to let them be compute nodes in the cluster.
If I'm understanding you correctly, you want Node2 and Node3 to hold exactly no data but still service searches?

That's not really how Elasticsearch works. You could certainly disable both primary assignment and replication on these nodes and redirect their local Logstash to Node1 and Node4 for storage, and you could even execute queries from Node2 and Node3 in this setup if done correctly. However, Node1 and Node4 would still do all of the computing for searches, because they would hold all the shards and therefore all the Lucene segments.

Elasticsearch distributes work by storing some of those segments on other machines and delegating to those machines the job of looking through "what they have" for a particular request. Think of it like a bunch of filing cabinets, where each cabinet can only have exactly one person looking through it. You'd be able to search much faster with 4 cabinets and 4 workers than with 2 cabinets and 2 workers. You cannot have 2 cabinets with 4 workers searching through those cabinets.

You can leverage ingest nodes to effectively serve as "compute nodes" exclusively for the purpose of indexing, but these nodes will have no say in the searches performed.

In my opinion, all of the above is well outside the regular functionality of Nagios Log Server.

technosol wrote: May I know if this is doable? If yes then can you give me a detailed guide?
It is possible, though I don't know of any "detailed guide" for this setup. Probably because there's really no practical benefit to the setup.
Former Nagios employee
https://www.mcapra.com/
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Disable data replication in a node

Post by scottwilkerson »

Yes, this is very doable. On the 2 "compute nodes", edit /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml and set:

    node.data: false

Then restart Elasticsearch:

    service elasticsearch restart

Once the nodes come back up, they can share the work of receiving logs and participate in the search load, but they do not count against your license requirements.
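
The edit described above could be scripted along the following lines. This is only a minimal sketch: `set_node_data_false` is a hypothetical helper name (nothing shipped with Nagios Log Server), and it assumes the setting is written as a top-level `node.data:` line in a flat YAML file. It makes a `.bak` copy first so the change is easy to revert.

```shell
#!/bin/sh
# Sketch only: ensure a config file has an active "node.data: false" line.
# set_node_data_false is a hypothetical helper, not part of Nagios Log Server.
set_node_data_false() {
    conf="$1"

    # Keep a backup copy next to the original so the change can be reverted.
    cp "$conf" "$conf.bak" || return 1

    if grep -q '^[[:space:]]*node\.data[[:space:]]*:' "$conf.bak"; then
        # An uncommented node.data line already exists: rewrite its value.
        sed 's/^[[:space:]]*node\.data[[:space:]]*:.*/node.data: false/' \
            "$conf.bak" > "$conf"
    else
        # No active node.data line (commented-out ones are ignored): append one.
        printf '\nnode.data: false\n' >> "$conf"
    fi
}
```

It would be called as root with the path from the post, for example `set_node_data_false /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml`, followed by `service elasticsearch restart`.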
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com
technosol
Posts: 36
Joined: Mon May 07, 2018 11:46 am

Re: Disable data replication in a node

Post by technosol »

Hi,

Thank you for the explanation and the guide. I just tested this in a 3-node cluster, configuring elasticsearch.yml to disable node.data.

# Instances 3
# Data Instances 2

Also, there was no impact on receiving logs or on search queries.
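
The instance counts reported above can probably also be cross-checked from the command line. The sketch below assumes that Elasticsearch's `_cat/nodes` API is reachable on `localhost:9200` and that its `node.role` column contains the letter `d` for data nodes (as far as I know that holds for the Elasticsearch 1.x line, but verify it against your version). `count_data_nodes` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: count_data_nodes is a hypothetical helper.
# Reads "<name> <role>" lines on stdin and prints "<total> <data>".
# The role is taken from the last column, so node names containing
# spaces do not confuse the count.
count_data_nodes() {
    awk 'NF >= 2 {
             total++
             if ($NF ~ /d/) data++   # assumption: "d" marks a data node
         }
         END { printf "%d %d\n", total, data }'
}

# Against a live cluster (assumed URL, not executed here):
#   curl -s 'http://localhost:9200/_cat/nodes?h=name,node.role' | count_data_nodes
```

For a 3-node cluster with one non-data node, this should agree with the "Instances" and "Data Instances" figures quoted in the post.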
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Disable data replication in a node

Post by scottwilkerson »

The only real impact is that you will have slightly less redundancy for self-healing. Also, the non-data node doesn't participate in the load of finding data on the cluster: it sends the requests to the nodes containing the shards and then performs the aggregations itself.

Done right it can be very effective.
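
To see which nodes actually hold the shards, and therefore do the searching described above, something like the following sketch may help. It assumes the `_cat/shards` API on `localhost:9200` with the columns `index,shard,prirep,state,node`, and node names without spaces. `shards_per_node` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: shards_per_node is a hypothetical helper.
# Reads "<index> <shard> <prirep> <state> <node>" lines on stdin and
# prints "<node> <count>" for every node holding started shards,
# sorted by node name. Unassigned shards have no node and are skipped.
shards_per_node() {
    awk '$4 == "STARTED" { count[$5]++ }
         END { for (n in count) print n, count[n] }' | sort
}

# Against a live cluster (assumed URL, not executed here):
#   curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,node' | shards_per_node
```

A node configured with `node.data: false` should not appear in the output at all, since it holds no shards.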