_routing field

A document is routed to a particular shard in an index using the following formula:

shard_num = hash(_routing) % num_primary_shards

By default, _routing is the document’s _parent ID if one is present; otherwise it is the document’s _id.
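The formula and its default can be sketched in a few lines of Python. This is only an illustration: stand_in_hash uses CRC32 as a placeholder for Elasticsearch's internal hash function, so the shard numbers it produces will not match a real cluster, and the helper names shard_for and effective_routing are hypothetical.

```python
import zlib
from typing import Optional


def stand_in_hash(value: str) -> int:
    # Assumption: CRC32 is a placeholder for Elasticsearch's internal hash.
    return zlib.crc32(value.encode("utf-8"))


def shard_for(routing: str, num_primary_shards: int) -> int:
    # shard_num = hash(_routing) % num_primary_shards
    return stand_in_hash(routing) % num_primary_shards


def effective_routing(doc_id: str,
                      parent_id: Optional[str] = None,
                      routing: Optional[str] = None) -> str:
    # An explicit routing value takes precedence; otherwise fall back to
    # the parent ID if there is one, and finally to the document ID.
    if routing is not None:
        return routing
    if parent_id is not None:
        return parent_id
    return doc_id
```

Because the result depends only on the routing value and the number of primary shards, the same routing value always resolves to the same shard for a given index.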

Custom routing patterns can be implemented by specifying a custom routing value per document. For instance:

PUT my_index/my_type/1?routing=user1&refresh=true 
{
  "title": "This is a document"
}

GET my_index/my_type/1?routing=user1 

This document uses user1 as its routing value, instead of its ID.

The same routing value needs to be provided when getting, deleting, or updating the document.

The value of the _routing field is accessible in queries:

GET my_index/_search
{
  "query": {
    "terms": {
      "_routing": [ "user1" ] 
    }
  }
}

Querying on the _routing field (also see the ids query)

Searching with custom routing

Custom routing can reduce the impact of searches. Instead of having to fan out a search request to all the shards in an index, the request can be sent to just the shard that matches the specific routing value (or values):

GET my_index/_search?routing=user1,user2 
{
  "query": {
    "match": {
      "title": "document"
    }
  }
}

This search request will only be executed on the shards associated with the user1 and user2 routing values.
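To see why this narrows the search, the set of target shards can be sketched in Python as follows. The hash is a CRC32 placeholder rather than Elasticsearch's own, and shards_for_search is a hypothetical helper, so treat this as a model of the idea and not of the actual implementation.

```python
import zlib
from typing import Set


def stand_in_hash(value: str) -> int:
    # Assumption: CRC32 is a placeholder for Elasticsearch's internal hash.
    return zlib.crc32(value.encode("utf-8"))


def shards_for_search(routing_param: str, num_primary_shards: int) -> Set[int]:
    # Split the comma-separated routing parameter and map each value to
    # its shard; duplicates collapse because the result is a set.
    values = [v for v in routing_param.split(",") if v]
    return {stand_in_hash(v) % num_primary_shards for v in values}
```

For a request with routing=user1,user2 the result contains at most two shard numbers, however many primary shards the index has.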

Making a routing value required

When using custom routing, it is important to provide the routing value whenever indexing, getting, deleting, or updating a document.

Forgetting the routing value can lead to a document being indexed on more than one shard. As a safeguard, the _routing field can be configured to make a custom routing value required for all CRUD operations:

PUT my_index2
{
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true 
      }
    }
  }
}

PUT my_index2/my_type/1 
{
  "text": "No routing value provided"
}

Routing is required for my_type documents.

This index request throws a routing_missing_exception.

Unique IDs with custom routing

When indexing documents specifying a custom _routing, the uniqueness of the _id is not guaranteed across all of the shards in the index. In fact, documents with the same _id might end up on different shards if indexed with different _routing values.

It is up to the user to ensure that IDs are unique across the index.
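The effect can be reproduced with a small in-memory model. This is a minimal sketch in which each shard is a plain dict keyed by _id, CRC32 stands in for the real hash function, and index_doc and copies_of are hypothetical helpers.

```python
import zlib
from typing import Dict, List, Optional


def stand_in_hash(value: str) -> int:
    # Assumption: CRC32 is a placeholder for Elasticsearch's internal hash.
    return zlib.crc32(value.encode("utf-8"))


def index_doc(shards: List[Dict[str, dict]], doc_id: str, source: dict,
              routing: Optional[str] = None) -> int:
    # Without a routing value the document ID decides the shard.
    key = routing if routing is not None else doc_id
    shard_num = stand_in_hash(key) % len(shards)
    # Uniqueness of _id is only enforced inside the chosen shard.
    shards[shard_num][doc_id] = source
    return shard_num


def copies_of(shards: List[Dict[str, dict]], doc_id: str) -> List[int]:
    # Return the numbers of all shards that hold a document with this ID.
    return [n for n, shard in enumerate(shards) if doc_id in shard]
```

Indexing the same _id twice with routing values that resolve to different shards leaves two entries in copies_of, whereas repeating the same routing value simply overwrites the existing document.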

Routing to an index partition

An index can be configured such that custom routing values will go to a subset of the shards rather than a single shard. This helps mitigate the risk of ending up with an imbalanced cluster while still reducing the impact of searches.

This is done by providing the index-level setting index.routing_partition_size at index creation. As the partition size increases, the data becomes more evenly distributed, at the expense of having to search more shards per request.

When this setting is present, the formula for calculating the shard becomes:

shard_num = (hash(_routing) + (hash(_id) % routing_partition_size)) % num_primary_shards

That is, the _routing field is used to calculate a set of shards within the index and then the _id is used to pick a shard within that set.

To enable this feature, the index.routing_partition_size should have a value greater than 1 and less than index.number_of_shards.
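The partitioned formula can be sketched in Python as follows. As before, CRC32 is only a placeholder for the real hash function, and the function names and the ValueError raised for an out-of-range partition size are assumptions made for this illustration.

```python
import zlib
from typing import Set


def stand_in_hash(value: str) -> int:
    # Assumption: CRC32 is a placeholder for Elasticsearch's internal hash.
    return zlib.crc32(value.encode("utf-8"))


def partitioned_shard_for(routing: str, doc_id: str,
                          routing_partition_size: int,
                          num_primary_shards: int) -> int:
    # The partition size must be greater than 1 and less than the number
    # of primary shards for the feature to apply.
    if not 1 < routing_partition_size < num_primary_shards:
        raise ValueError("routing_partition_size must be > 1 and < number of shards")
    # The _id selects an offset inside the partition chosen by _routing.
    offset = stand_in_hash(doc_id) % routing_partition_size
    return (stand_in_hash(routing) + offset) % num_primary_shards


def partition_for(routing: str, routing_partition_size: int,
                  num_primary_shards: int) -> Set[int]:
    # All shards a given routing value can resolve to, whatever the _id.
    base = stand_in_hash(routing)
    return {(base + offset) % num_primary_shards
            for offset in range(routing_partition_size)}
```

In this model a routing value always maps to routing_partition_size consecutive shards (wrapping around at the end of the index), so a routed search needs to visit only those shards.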

Once enabled, the partitioned index will have the following limitations:

  • Mappings with parent-child relationships cannot be created within it.
  • All mappings within the index must have the _routing field marked as required.