Disk-based shard allocation
editDisk-based shard allocation
editElasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node.
Below are the settings that can be configured in the elasticsearch.yml
config
file or updated dynamically on a live cluster with the
cluster-update-settings API:
-
cluster.routing.allocation.disk.threshold_enabled
-
Defaults to
true
. Set tofalse
to disable the disk allocation decider. -
cluster.routing.allocation.disk.watermark.low
-
Controls the low watermark for disk usage. It defaults to
85%
, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like500mb
) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated. -
cluster.routing.allocation.disk.watermark.high
-
Controls the high watermark. It defaults to
90%
, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.
-
cluster.routing.allocation.disk.watermark.flood_stage
-
Controls the flood stage watermark, which defaults to 95%. Elasticsearch enforces a read-only index block (
index.blocks.read_only_allow_delete
) on every index that has one or more shards allocated on the node, and that has at least one disk exceeding the flood stage. This setting is a last resort to prevent nodes from running out of disk space. The index block is automatically released when the disk utilization falls below the high watermark.You cannot mix the usage of percentage values and byte values within these settings. Either all values are set to percentage values, or all are set to byte values. This enforcement is so that Elasticsearch can validate that the settings are internally consistent, ensuring that the low disk threshold is less than the high disk threshold, and the high disk threshold is less than the flood stage threshold.
An example of resetting the read-only index block on the
twitter
index:PUT /twitter/_settings { "index.blocks.read_only_allow_delete": null }
-
cluster.info.update.interval
-
How often Elasticsearch should check on disk usage for each node in the
cluster. Defaults to
30s
. -
cluster.routing.allocation.disk.include_relocations
-
[7.5.0]
Deprecated in 7.5.0. Future versions will always account for relocations.
Defaults to
true
, which means that Elasticsearch will take into account shards that are currently being relocated to the target node when computing a node’s disk usage. Taking relocating shards' sizes into account may, however, mean that the disk usage for a node is incorrectly estimated on the high side, since the relocation could be 90% complete and a recently retrieved disk usage would include the total size of the relocating shard as well as the space already used by the running relocation.
Percentage values refer to used disk space, while byte values refer to free disk space. This can be confusing, since it flips the meaning of high and low. For example, it makes sense to set the low watermark to 10gb and the high watermark to 5gb, but not the other way around.
An example of updating the low watermark to at least 100 gigabytes free, a high watermark of at least 50 gigabytes free, and a flood stage watermark of 10 gigabytes free, and updating the information about the cluster every minute:
PUT _cluster/settings { "transient": { "cluster.routing.allocation.disk.watermark.low": "100gb", "cluster.routing.allocation.disk.watermark.high": "50gb", "cluster.routing.allocation.disk.watermark.flood_stage": "10gb", "cluster.info.update.interval": "1m" } }