Hi,
Suppose I have a 4-node cluster. Due to unavoidable circumstances, I want to disable data replication on two of the nodes.
Cluster:
Node1+Node2+Node3+Node4
Remove data replication for:
Node2 & Node3 --> I am not taking off these nodes from the cluster. I want to let them be compute nodes in the cluster.
Data replication will be done by only Node1 & Node4.
May I know if this is doable? If yes then can you give me a detailed guide?
Thank you.
Disable data replication in a node
Re: Disable data replication in a node
technosol wrote: Node2 & Node3 --> I am not taking off these nodes from the cluster. I want to let them be compute nodes in the cluster.
If I'm understanding you correctly, you want Node2 and Node3 to hold exactly no data but still service searches?
That's not really how ElasticSearch works. You could certainly disable both primary assignment and replication on these nodes and redirect their local Logstash to Node1 and Node4 for storage, and you could even execute queries from Node2 and Node3 in this setup if done correctly, but Node1 and Node4 would still do all of the computing for searches because Node1 and Node4 would hold all the shards and subsequently all the Lucene segments. ElasticSearch distributes work by storing some of those segments on other machines and delegating those other machines to look through "what they have" with respect to a particular request. Think of it like a bunch of filing cabinets and each cabinet can only have exactly one person looking through it. You'd be able to search much faster with 4 cabinets and 4 workers than with 2 cabinets and 2 workers. You cannot have 2 cabinets with 4 workers searching through those cabinets.
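To make the filing-cabinet picture concrete, here is a toy model in Python. It is only a sketch: the node names follow the thread, but the shard contents and the `search` helper are invented for illustration, and nothing here talks to a real cluster. It counts how many documents each node scans when the query enters through a node that holds no shards.

```python
from collections import Counter

# Hypothetical layout: only Node1 and Node4 hold shards (lists of log lines).
SHARDS = {
    "Node1": [["error disk full", "login ok"], ["error timeout"]],
    "Node4": [["login failed", "error disk full"], ["backup ok"]],
}


def search(coordinator, term):
    """Fan a query out to the shard-holding nodes and merge the hits."""
    work = Counter()  # documents scanned, per node
    hits = []
    for node, shards in SHARDS.items():
        for shard in shards:
            for doc in shard:
                work[node] += 1  # scanning happens where the shard lives
                if term in doc:
                    hits.append(doc)
    work[coordinator] += 0  # the coordinator merges results but scans nothing
    return sorted(hits), dict(work)


hits, work = search("Node2", "error")
print(hits)
print(work)
```

In this model Node2 scans zero documents no matter how many queries it receives; all of the scanning lands on Node1 and Node4, which is the "2 cabinets, 2 workers" situation.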
You can leverage ingest nodes to effectively serve as "compute nodes" exclusively for the purpose of indexing, but these nodes will have no say in the searches performed.
In my opinion, all of the above is good and well outside the regular functionality of Nagios Log Server.
technosol wrote: May I know if this is doable? If yes then can you give me a detailed guide?
It is possible, though I don't know of any "detailed guide" for this setup, probably because there's really no practical benefit to the setup.
Former Nagios employee
https://www.mcapra.com/
Re: Disable data replication in a node
Yes, this is very doable. On the 2 "compute nodes", edit /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml and set:
Code:
node.data: false
Then restart Elasticsearch:
Code:
service elasticsearch restart
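If the same change has to be made on more than one node, the edit to elasticsearch.yml could be scripted. The following is a minimal Python sketch, not an official Nagios Log Server tool: the `set_node_data` helper is hypothetical, it works on the file contents as a string, and it assumes `node.data` is a top-level key that may be present, commented out, or missing.

```python
import re

# Match an active or a commented-out `node.data:` line (top-level key assumed).
ACTIVE = re.compile(r"^[ \t]*node\.data[ \t]*:.*$", re.MULTILINE)
COMMENTED = re.compile(r"^[ \t]*#[ \t]*node\.data[ \t]*:.*$", re.MULTILINE)


def set_node_data(config_text, enabled):
    """Return config_text with node.data set to true or false."""
    new_line = "node.data: " + ("true" if enabled else "false")
    # Prefer rewriting an active setting, then a commented-out one.
    for pattern in (ACTIVE, COMMENTED):
        if pattern.search(config_text):
            return pattern.sub(new_line, config_text, count=1)
    # Setting not present at all: append it on its own line.
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


sample = "cluster.name: example\n#node.data: true\n"
print(set_node_data(sample, False))
```

Reading the real file, writing the result back, and restarting the service are left out on purpose; take a backup of elasticsearch.yml before changing it.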
Re: Disable data replication in a node
Hi,
Thank you for the explanation and the guide. I just tested this on a 3-node cluster, disabling node.data in elasticsearch.yml. The cluster now shows:
# Instances 3
# Data Instances 2
Also, there was no impact on receiving logs or on search queries.
Re: Disable data replication in a node
The only real impact is that you will have slightly less redundancy for self-healing. The non-data node also doesn't share in the load of finding data on the cluster: it sends the requests to the nodes containing the shards and then performs the aggregations on what they return.
Done right it can be very effective.