Disk-based Shard Allocation

Elasticsearch can be configured to prevent shard allocation on a node depending on that node's disk usage. This functionality is enabled by default. It can be changed either in the configuration file or dynamically through the cluster settings API; for example, to disable it:

curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.disk.threshold_enabled" : false
    }
}'

When enabled, Elasticsearch uses two watermarks to decide whether new shards may be allocated to a node and whether existing shards can remain on it.

cluster.routing.allocation.disk.watermark.low controls the low watermark for disk usage. It defaults to 85%, meaning Elasticsearch will not allocate new shards to a node once more than 85% of its disk is used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards to a node if less than the configured amount of space is available.

cluster.routing.allocation.disk.watermark.high controls the high watermark. It defaults to 90%, meaning Elasticsearch will attempt to relocate shards to another node if the node's disk usage rises above 90%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node.
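The effect of the two percentage watermarks can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Elasticsearch's implementation; the function name `allocation_state` and the returned labels are hypothetical, and the defaults are the 85% and 90% values stated in the text.

```python
def allocation_state(used_percent, low=85.0, high=90.0):
    """Classify a node by its disk usage, given percentage watermarks.

    Hypothetical helper for illustration; thresholds refer to USED disk space.
    """
    if used_percent > high:
        # Above the high watermark: shards should be moved to another node.
        return "relocate shards away"
    if used_percent > low:
        # Above the low watermark: existing shards stay, no new ones arrive.
        return "no new shards"
    return "ok"


if __name__ == "__main__":
    for used in (80.0, 87.0, 93.0):
        print(used, "->", allocation_state(used))
```

A node at 87% used falls between the two watermarks: it keeps its shards but receives no new ones.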

Percentage values refer to used disk space, while byte values refer to free disk space. This can be confusing, since it flips the meaning of high and low. For example, it makes sense to set the low watermark to 10gb and the high watermark to 5gb, but not the other way around.
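The flipped meaning is easier to see in code. The following is a minimal sketch, assuming a simplified set of byte units and a hypothetical helper named `exceeds_watermark`; it is not how Elasticsearch parses or evaluates these settings, only a model of the rule stated above.

```python
import re

# Simplified unit table assumed for this sketch.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def exceeds_watermark(watermark, used_percent, free_bytes):
    """Return True if a node has crossed the given watermark.

    Percentage values are compared against USED space,
    byte values against FREE space.
    """
    value = watermark.strip().lower()
    if value.endswith("%"):
        return used_percent > float(value[:-1])
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(b|kb|mb|gb|tb)", value)
    if match is None:
        raise ValueError(f"unrecognized watermark: {watermark!r}")
    threshold = float(match.group(1)) * _UNITS[match.group(2)]
    return free_bytes < threshold


if __name__ == "__main__":
    free = 8 * 1024**3  # a node with 8 GB free, 92% used
    print(exceeds_watermark("10gb", 92.0, free))  # low watermark as bytes
    print(exceeds_watermark("5gb", 92.0, free))   # high watermark as bytes
```

With 8 GB free, the node has crossed a low watermark of 10gb but not a high watermark of 5gb, which is why the byte value for the low watermark must be the larger of the two.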

Both watermark settings can be changed dynamically using the cluster settings API. By default, Elasticsearch retrieves information about the disk usage of the nodes every 30 seconds. This interval can be changed with the cluster.info.update.interval setting.

The following example sets the low watermark to 80% of disk space used, sets the high watermark to 50 gigabytes free, and refreshes the disk usage information every minute:

curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.disk.watermark.low" : "80%",
        "cluster.routing.allocation.disk.watermark.high" : "50gb",
        "cluster.info.update.interval" : "1m"
    }
}'

By default, Elasticsearch takes into account shards that are currently being relocated to the target node when computing a node’s disk usage. This can be changed by setting cluster.routing.allocation.disk.include_relocations to false (it defaults to true). Taking the sizes of relocating shards into account may, however, cause a node’s disk usage to be overestimated: the relocation could be 90% complete, in which case a recently retrieved disk usage figure already includes the space used by the running relocation, and the total size of the relocating shard is then counted on top of it.
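The overestimate can be worked through with concrete numbers. The sketch below uses a deliberately simplified model in which the full size of the relocating shard is added to the last reported usage; the function name `estimate_target_usage` and all figures are hypothetical and chosen only to mirror the 90% case described above.

```python
GB = 1024**3


def estimate_target_usage(reported_used, shard_size, include_relocations=True):
    """Estimate disk usage on a node that is the target of a relocation.

    Assumed model: with include_relocations, the whole shard size is added
    on top of whatever the last disk usage sample reported.
    """
    if include_relocations:
        return reported_used + shard_size
    return reported_used


base_used = 60 * GB        # data on the node before the relocation started
shard_size = 20 * GB       # total size of the incoming shard
already_copied = 18 * GB   # relocation is 90% complete

# The latest disk usage sample already contains the copied portion.
reported_used = base_used + already_copied

estimate = estimate_target_usage(reported_used, shard_size)
final_used = base_used + shard_size      # usage once the relocation finishes
overestimate = estimate - final_used     # equals the part counted twice

if __name__ == "__main__":
    print("estimate:", estimate // GB, "GB")
    print("actual final usage:", final_used // GB, "GB")
    print("overestimate:", overestimate // GB, "GB")
```

In this model the estimate is too high by exactly the portion of the shard that has already been copied, so the error grows as the relocation nears completion.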