Index-level shard allocation filtering

You can use shard allocation filters to control where Elasticsearch allocates shards of a particular index. These per-index filters are applied in conjunction with cluster-wide allocation filtering and allocation awareness.

Shard allocation filters can be based on custom node attributes or the built-in _name, _host_ip, _publish_ip, _ip, _host, _id, _tier and _tier_preference attributes. Index lifecycle management uses filters based on custom node attributes to determine how to reallocate shards when moving between phases.

The index.routing.allocation settings are dynamic, so existing indices can be moved immediately from one set of nodes to another. Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.

For example, you could use a custom node attribute to indicate a node’s performance characteristics and use shard allocation filtering to route shards for a particular index to the most appropriate class of hardware.

Enabling index-level shard allocation filtering

To filter based on a custom node attribute:

  1. Specify the filter characteristics with a custom node attribute in each node’s elasticsearch.yml configuration file. For example, if you have small, medium, and big nodes, you could add a size attribute to filter based on node size.

    node.attr.size: medium

    You can also set custom attributes when you start a node:

    ./bin/elasticsearch -Enode.attr.size=medium
  2. Add a routing allocation filter to the index. The index.routing.allocation settings support three types of filters: include, exclude, and require. For example, to tell Elasticsearch to allocate shards from the test index to either big or medium nodes, use index.routing.allocation.include:

    Python:

    resp = client.indices.put_settings(
        index="test",
        body={"index.routing.allocation.include.size": "big,medium"},
    )
    print(resp)

    Ruby:

    response = client.indices.put_settings(
      index: 'test',
      body: {
        'index.routing.allocation.include.size' => 'big,medium'
      }
    )
    puts response

    Console:

    PUT test/_settings
    {
      "index.routing.allocation.include.size": "big,medium"
    }

    If you specify multiple filters, a node must satisfy all of the following conditions simultaneously before shards can be relocated to it:

    • If any require type conditions are specified, all of them must be satisfied
    • If any exclude type conditions are specified, none of them may be satisfied
    • If any include type conditions are specified, at least one of them must be satisfied

    For example, to move the test index to big nodes in rack1, you could specify:

    Python:

    resp = client.indices.put_settings(
        index="test",
        body={
            "index.routing.allocation.require.size": "big",
            "index.routing.allocation.require.rack": "rack1",
        },
    )
    print(resp)

    Ruby:

    response = client.indices.put_settings(
      index: 'test',
      body: {
        'index.routing.allocation.require.size' => 'big',
        'index.routing.allocation.require.rack' => 'rack1'
      }
    )
    puts response

    Console:

    PUT test/_settings
    {
      "index.routing.allocation.require.size": "big",
      "index.routing.allocation.require.rack": "rack1"
    }
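
The three conditions listed in step 2 can be sketched as a small pure-Python function. This is an illustration of the documented rules, not Elasticsearch's implementation: the function name node_is_eligible and the dict-based inputs are hypothetical, values are compared by exact string equality (no wildcards), and each condition counts as satisfied when the node's attribute equals one of the listed values. The examples use a single value per require condition; multi-value require conditions are not modelled.

```python
def node_is_eligible(node_attrs, include=None, require=None, exclude=None):
    """Return True if a node's attributes pass all three filter types.

    Each filter is a dict mapping an attribute name to a comma-separated
    string of values, mirroring the settings bodies shown above.
    Hypothetical helper for illustration only.
    """
    include = include or {}
    require = require or {}
    exclude = exclude or {}

    def satisfied(attr, values):
        # Simplification: a condition is satisfied when the node's value
        # for this attribute equals one of the listed values.
        return node_attrs.get(attr) in values.split(",")

    # require: every condition must be satisfied.
    if not all(satisfied(a, v) for a, v in require.items()):
        return False
    # exclude: no condition may be satisfied.
    if any(satisfied(a, v) for a, v in exclude.items()):
        return False
    # include: if any are given, at least one must be satisfied.
    if include and not any(satisfied(a, v) for a, v in include.items()):
        return False
    return True


big_rack1 = {"size": "big", "rack": "rack1"}
big_rack2 = {"size": "big", "rack": "rack2"}
required = {"size": "big", "rack": "rack1"}

print(node_is_eligible(big_rack1, require=required))
print(node_is_eligible(big_rack2, require=required))
```

With the require filters from the example above, only the node tagged both big and rack1 is eligible; the big node in rack2 fails the rack condition.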

Index allocation filter settings

index.routing.allocation.include.{attribute}

Assign the index to a node whose {attribute} has at least one of the comma-separated values.

index.routing.allocation.require.{attribute}

Assign the index to a node whose {attribute} has all of the comma-separated values.

index.routing.allocation.exclude.{attribute}

Assign the index to a node whose {attribute} has none of the comma-separated values.
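
Because every setting name follows the same index.routing.allocation.{type}.{attribute} pattern, a settings body can be assembled programmatically. The helper below is a minimal sketch; allocation_filter_setting is a hypothetical name, not part of any Elasticsearch client.

```python
def allocation_filter_setting(filter_type, attribute, values):
    """Build one index.routing.allocation.* setting as a single-entry dict.

    Hypothetical helper for illustration only.
    """
    if filter_type not in ("include", "require", "exclude"):
        raise ValueError(f"unknown filter type: {filter_type!r}")
    key = f"index.routing.allocation.{filter_type}.{attribute}"
    # Multiple values are joined into one comma-separated string.
    return {key: ",".join(values)}


# Assemble the body for "big nodes in rack1".
body = {}
body.update(allocation_filter_setting("require", "size", ["big"]))
body.update(allocation_filter_setting("require", "rack", ["rack1"]))
print(body)
```

The resulting dict has the same shape as the request bodies used on this page, so it could be passed as the body argument of client.indices.put_settings.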

The index allocation settings support the following built-in attributes:

_name

Match nodes by node name

_host_ip

Match nodes by host IP address (IP associated with hostname)

_publish_ip

Match nodes by publish IP address

_ip

Match either _host_ip or _publish_ip

_host

Match nodes by hostname

_id

Match nodes by node id

_tier

Match nodes by the node’s data tier role. For more details, see data tier allocation filtering.

_tier filtering is based on node roles. Only a subset of roles are data tier roles, and the generic data role will match any tier filtering.

You can use wildcards when specifying attribute values, for example:

Python:

resp = client.indices.put_settings(
    index="test",
    body={"index.routing.allocation.include._ip": "192.168.2.*"},
)
print(resp)

Ruby:

response = client.indices.put_settings(
  index: 'test',
  body: {
    'index.routing.allocation.include._ip' => '192.168.2.*'
  }
)
puts response

Console:

PUT test/_settings
{
  "index.routing.allocation.include._ip": "192.168.2.*"
}
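
To get a feel for which addresses a pattern such as 192.168.2.* would select, the matching can be approximated locally. The sketch below assumes that only * acts as a wildcard and that addresses are compared as plain strings; both function names are hypothetical, and Elasticsearch's own matcher may differ in details. It also reflects the _ip rule described above, where a pattern may match either the host IP or the publish IP.

```python
import re


def simple_match(pattern, value):
    """Match value against a pattern where * stands for any run of characters."""
    # Escape every literal part, then join the parts with ".*" for each *.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def ip_filter_matches(patterns, host_ip, publish_ip):
    """Approximate _ip: any listed pattern may match either address."""
    return any(
        simple_match(pattern, ip)
        for pattern in patterns.split(",")
        for ip in (host_ip, publish_ip)
    )


print(simple_match("192.168.2.*", "192.168.2.17"))
print(simple_match("192.168.2.*", "192.168.20.1"))
print(ip_filter_matches("192.168.2.*", "10.0.0.5", "192.168.2.9"))
```

Note that the trailing dot in the pattern matters: 192.168.2.* covers addresses in 192.168.2.x but not 192.168.20.x.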