WARNING: Version 5.0 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Cluster Reroute
editCluster Reroute
editThe reroute command allows to explicitly execute a cluster reroute allocation command including specific commands. For example, a shard can be moved from one node to another explicitly, an allocation can be canceled, or an unassigned shard can be explicitly allocated on a specific node.
Here is a short example of how a simple reroute API call:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{ "commands" : [ { "move" : { "index" : "test", "shard" : 0, "from_node" : "node1", "to_node" : "node2" } }, { "allocate_replica" : { "index" : "test", "shard" : 1, "node" : "node3" } } ] }'
An important aspect to remember is the fact that once when an allocation
occurs, the cluster will aim at re-balancing its state back to an even
state. For example, if the allocation includes moving a shard from
node1
to node2
, in an even
state, then another shard will be moved
from node2
to node1
to even things out.
The cluster can be set to disable allocations, which means that only the explicitly allocations will be performed. Obviously, only once all commands has been applied, the cluster will aim to be re-balance its state.
Another option is to run the commands in dry_run
(as a URI flag, or in
the request body). This will cause the commands to apply to the current
cluster state, and return the resulting cluster after the commands (and
re-balancing) has been applied.
If the explain
parameter is specified, a detailed explanation of why the
commands could or could not be executed is returned.
The commands supported are:
-
move
-
Move a started shard from one node to another node. Accepts
index
andshard
for index name and shard number,from_node
for the node to move the shardfrom
, andto_node
for the node to move the shard to. -
cancel
-
Cancel allocation of a shard (or recovery). Accepts
index
andshard
for index name and shard number, andnode
for the node to cancel the shard allocation on. It also acceptsallow_primary
flag to explicitly specify that it is allowed to cancel allocation for a primary shard. This can be used to force resynchronization of existing replicas from the primary shard by cancelling them and allowing them to be reinitialized through the standard reallocation process. -
allocate_replica
-
Allocate an unassigned replica shard to a node. Accepts the
index
andshard
for index name and shard number, andnode
to allocate the shard to. Takes allocation deciders into account.
Two more commands are available that allow the allocation of a primary shard to a node. These commands should however be used with extreme care, as primary shard allocation is usually fully automatically handled by Elasticsearch. Reasons why a primary shard cannot be automatically allocated include the following:
- A new index was created but there is no node which satisfies the allocation deciders.
- An up-to-date shard copy of the data cannot be found on the current data nodes in the cluster. To prevent data loss, the system does not automatically promote a stale shard copy to primary.
As a manual override, two commands to forcefully allocate primary shards are available:
-
allocate_stale_primary
-
Allocate a primary shard to a node that holds a stale copy. Accepts the
index
andshard
for index name and shard number, andnode
to allocate the shard to. Using this command may lead to data loss for the provided shard id. If a node which has the good copy of the data rejoins the cluster later on, that data will be overwritten with the data of the stale copy that was forcefully allocated with this command. To ensure that these implications are well-understood, this command requires the special fieldaccept_data_loss
to be explicitly set totrue
for it to work. -
allocate_empty_primary
-
Allocate an empty primary shard to a node. Accepts the
index
andshard
for index name and shard number, andnode
to allocate the shard to. Using this command leads to a complete loss of all data that was indexed into this shard, if it was previously started. If a node which has a copy of the data rejoins the cluster later on, that data will be deleted! To ensure that these implications are well-understood, this command requires the special fieldaccept_data_loss
to be explicitly set totrue
for it to work.
Retry failed shards
editThe cluster will attempt to allocate a shard a maximum of
index.allocation.max_retries
times in a row (defaults to 5
), before giving
up and leaving the shard unallocated. This scenario can be caused by
structural problems such as having an analyzer which refers to a stopwords
file which doesn’t exist on all nodes.
Once the problem has been corrected, allocation can be manually retried by
calling the _reroute
API with ?retry_failed
, which
will attempt a single retry round for these shards.