Cluster with different hardware

Posted: Wed Mar 07, 2018 1:37 pm
by ssoliveira
Hello guys,

Good afternoon,

We are working on the extension of our NLS cluster.

We currently have 4 nodes, and we are planning to increase to 20 or 40 nodes.

Currently the infrastructure is physical.

We have several pieces of equipment available for reuse: various models, various processor speeds, etc.

One option is to install ESXi on these devices, and rebuild the entire solution on VMs.

When I tested in the past, I found that:

When the cluster was distributed across hardware with varying performance, I saw strange behavior in Elasticsearch.

One thing I noticed was that the thread_pool queues were quite high.
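For anyone who wants to check the same symptom, here is a minimal sketch of how the thread pool queues could be inspected. It assumes an Elasticsearch version where the `_cat/thread_pool` API returns one row per node and pool and accepts `format=json` (older versions use a different column layout). The base URL, the threshold of 100, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_thread_pool_rows(base_url="http://localhost:9200"):
    """Fetch thread pool stats as a list of dicts (assumed URL and columns)."""
    url = (base_url + "/_cat/thread_pool"
           "?format=json&h=node_name,name,active,queue,rejected")
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def find_backlogged_pools(rows, queue_threshold=100):
    """Return (node, pool, queue, rejected) for pools that look overloaded.

    A pool is flagged when its queue reaches the (hypothetical) threshold
    or when it has already rejected requests. The cat API returns numbers
    as strings, so they are converted with int().
    """
    flagged = []
    for row in rows:
        queue = int(row["queue"])
        rejected = int(row["rejected"])
        if queue >= queue_threshold or rejected > 0:
            flagged.append((row["node_name"], row["name"], queue, rejected))
    # Largest queue first, so the slowest node tends to show up at the top.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Calling `find_backlogged_pools(fetch_thread_pool_rows())` a few times during indexing should show whether the long queues cluster on the weaker hosts.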

When I moved the VMs to hosts with similar processors and memory, the cluster was stable and smooth.

Is there any recommendation on this?
Is it not recommended to build the cluster on equipment with different performance levels?
Or is there some documentation?

Thank you very much.

Re: Cluster with different hardware

Posted: Wed Mar 07, 2018 1:59 pm
by scottwilkerson
You hit the nail on the head.

Because of how the data is shared in the cluster, it is best to configure the instances with hardware specs that are as close to identical as possible.

In testing we found the same thing: the power of the cluster is often limited by its weakest link. When you are indexing or searching, the load is spread out across the nodes, and the results cannot come back until the slowest machine returns its result set. A search often touches up to 5 instances, even when searching 1 day's logs.
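The weakest-link effect can be illustrated with a toy model. This is only a sketch: it assumes a search finishes when the slowest of the instances involved has answered, and it ignores the coordinating node's own merge time. All latency figures are made up for illustration.

```python
from statistics import mean


def query_latency_ms(instance_latencies_ms):
    """Toy model: the query returns only when the slowest instance has answered."""
    return max(instance_latencies_ms)


# Hypothetical per-instance response times for one search hitting 5 instances.
uniform = [40, 42, 41, 39, 40]   # identical hardware
mixed = [20, 22, 21, 19, 180]    # four fast hosts, one slow host

for label, latencies in (("uniform", uniform), ("mixed", mixed)):
    print(label,
          "mean per instance:", mean(latencies),
          "query latency:", query_latency_ms(latencies))
```

In this model the four faster hosts in the mixed set do not help: the query is still as slow as the one slow host.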

The most important hardware factors for a fast, stable environment are memory and disk speed.

Re: Cluster with different hardware

Posted: Wed Mar 07, 2018 3:18 pm
by ssoliveira
Hi Scott,

Thank you very much for the information, and confirmation of the tests.

In order to protect ourselves from possible future problems, hardware selection will be based on these assumptions.

Equal processors.
Equal memory, with equal speeds.
Disks with good, equal performance.
All under the same physical switch stack

Excellent, thank you very much.

Sharing an old story.

One lesson learned was: do not build the cluster with a firewall in the middle. Our first environment was multi-rack (distant locations), with a firewall between the nodes. We had a lot of trouble until we found out that the firewall was dropping established TCP connections after a certain amount of time.

After we removed the firewall, the environment became much more stable, and beautiful.
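For cases where a firewall cannot be removed, a common general mitigation for idle-timeout drops is TCP keepalive. This is not what we did here, and it is only a generic socket-level sketch, not an Elasticsearch configuration. The function name and the timing values are hypothetical, and the fine-grained options are only set where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=300, interval_s=60, probes=5):
    """Turn on TCP keepalive so an idle connection still sends periodic probes.

    idle_s should be shorter than the firewall's idle timeout (value assumed).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below are platform-specific (available on Linux),
    # so each one is guarded before use.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

With keepalive enabled, the firewall keeps seeing traffic on an otherwise idle connection, so its idle timer is less likely to expire.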

I believe we can close the ticket.