Rebalance/Failover
Rebalance
The search service maintains a cluster-wide set of index definitions and metadata, which allows the redistribution of indexes and index replicas during a rebalance.
Handling partitions during a rebalance
The search service redistributes the index partitions across the available search service nodes for a balanced partition-node assignment during a rebalance operation. The newly assigned index partitions are built utilizing the DCP feed on the new nodes. Once the new partitions are built up to the most current or latest sequence numbers, the partitions are promoted to handle live traffic, and the older partitions are deleted from the system.
During rebalancing, the search service does not wait for all replicas of an index to be built. This is expected behavior since the replica partitions are not used for serving FTS query traffic. |
Impacts on the live traffic updates
The live traffic is never functionally affected. However, we cannot entirely rule out the performance impacts of a background rebalance operation on the live cluster traffic. If the user has mutations happening on their data service nodes, the new data will continue to be ingested into the FTS indexes despite the rebalance/failover operation.
Tips for faster rebalance operations
As the rebalance is a resource-consuming cluster management operation, it is always recommended to perform rebalancing during off-peak hours.
Search service moves or builds the index partitions one at a time per node during the rebalance operations, which could significantly increase the overall time taken for the rebalance operation.
There is a configurable option [maxConcurrentPartitionMovesPerNode] to bring the additional concurrency to how we move/build partitions during a rebalance operation.
By overriding the maxConcurrentPartitionMovesPerNode parameter using the runtime cluster option, we can control the concurrency of partition moves, enabling faster rebalance completion times.
Knowing what value to configure this parameter depends on the particular use case and available network bandwidth. Adequate testing must be done in a pre-production environment prior to deploying production clusters with this setting.
Configuring the maximum concurrent partition moves per node
Use the update endpoint for manager options.
curl -XPUT -u <username>:<password> http://<nodeIP>:8094/api/managerOptions -d '
{"maxConcurrentPartitionMovesPerNode":"3"}'
Checking the current value for maxConcurrentPartitionMovesPerNode
in a cluster
curl -XGET -u <username>:<password> http://<nodeIP>:8094/api/manager
Note: When multiple partitions are built parallel, it needs more RAM and mandates a higher RAM quota. As the rebalance operations consume resources, it is always advisable to plan them during non-peak hours. One way to speed up the rebalance operation is to enable the movement of partitions in parallel by configuring the manager option maxConcurrentPartitionMovesPerNode
as shown above.
Dealing with rebalance failures
Failovers
During failover of FTS service nodes, there is no partition movement, and hence the failover-rebalance is instantaneous. Search service promotes the replica index partitions to primary so that those serve the live cluster traffic instantly. Users can use failover and recovery rebalance while applying patches or upgrading software/hardware.
Things to keep track of during a rebalance
A failed over node can be recovered: added back into the cluster from which it was failed over through the rebalance operation. The index partitions residing on the recovered node would be reused during a recovery rebalance operation with no extra node additions/removals. This speeds up the recovery rebalance operation.
Failover-recovery steps
-
Failover the node that needs a quick software update or hardware maintenance.
-
Failover triggers replica partitions to become active. With replica partitions, live traffic can be served seamlessly.
-
Perform the software update or hardware maintenance operations.
-
Perform recover rebalance operation.
-
Cluster is back to normal/pre-failover safe state.
Why rebalance button remains active even after a successful rebalance
Even after a successful Search rebalance operation, if the rebalance button remains active then it is indicative of the cluster not containing enough nodes that host the Search service to support the configured number of replicas for the index.
Though validation checks ensures that there are enough search nodes to support the configured number of replica
partitions, the number of search nodes
could get compromised during the lifespan of a cluster due to various unforeseen events like hardware/network failures. And the active
rebalance button issue surfaces if the subsequent recovery or node addition rebalance operations for Search service isn’t bringing in enough nodes to support the configured replica count
for the partitions.
The user could resolve this by either,
-
adding enough Search nodes to the cluster to support the configured
replica
count. -
editing the index definition(s) to adjust the replica count plausible in the cluster.
The maximum number of replica count feasible in a cluster is one less than the total number of Search nodes in the cluster. Maximum feasible replica count = Total number of search nodes - 1. |