WARNING: Version 0.90 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Shards Allocation
Shards allocation is the process of allocating shards to nodes. This can happen during initial recovery, replica allocation, rebalancing, or handling nodes being added or removed.
The following settings may be used:
- Controls when rebalancing is allowed to happen, based on the overall state of all the index shards in the cluster. The allowed values include indices_all_active, which is the default and reduces chatter during initial recovery.
- Controls how many concurrent shard rebalancing operations are allowed cluster wide.
- Controls the number of initial primary recoveries that are allowed per node. Because the local gateway is used most of the time, these recoveries should be fast, so a node can handle more of them without creating load.
- The number of concurrent recoveries that are allowed to happen on a node.
- Disables new primary allocations. Note that this prevents allocation for newly created indices. This setting only really makes sense when it is updated dynamically using the cluster update settings API.
- Disables either primary or replica allocation (does not apply to newly created primaries; see the previous setting). Note that a replica will still be promoted to primary if no primary exists. This setting only really makes sense when it is updated dynamically using the cluster update settings API.
- Disables only replica allocation. As with the previous setting, it mainly makes sense when updated dynamically using the cluster update settings API.
- Performs a check to prevent allocation of multiple instances of the same shard on a single host, based on host name and host address. No check is performed by default. This setting only applies if multiple nodes are started on the same machine.
- The number of streams to open (at the node level) to recover a shard from a peer shard.
Shard Allocation Awareness
Cluster allocation awareness controls how shards and their replicas are allocated across generic attributes associated with the nodes. Let's explain it through an example:
Assume we have several racks. When we start a node, we can configure an attribute called rack_id (any attribute name works). For example, here is a sample config:
node.rack_id: rack_one
The above sets an attribute called rack_id for the relevant node, with a value of rack_one. Now we need to configure rack_id as one of the awareness allocation attributes (set it in the config of all master-eligible nodes):
cluster.routing.allocation.awareness.attributes: rack_id
The above means that the rack_id attribute will be used for awareness-based allocation of a shard and its replicas. For example, let's say we start 2 nodes with node.rack_id set to rack_one, and deploy a single index with 5 shards and 1 replica. The index will be fully deployed on the current nodes (5 shards and 1 replica each, total of 10 shards).
Now, if we start two more nodes with node.rack_id set to rack_two, shards will relocate to even out the number of shards across the nodes, but a shard and its replica will not be allocated in the same rack_id value.
The awareness attributes can hold several values, for example:
cluster.routing.allocation.awareness.attributes: rack_id,zone
NOTE: When using awareness attributes, shards will not be allocated to nodes that don’t have values set for those attributes.
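The rack example above can be approximated with a small Python sketch. It is a deliberately simplified model rather than the actual allocation algorithm: the function name, the node names and the single-attribute restriction are assumptions made for illustration. It shows that nodes without a value for the awareness attribute are skipped, and that a replica goes to a different rack than its primary once a second rack exists.

```python
from typing import Dict, List


def replica_candidates(
    nodes: Dict[str, Dict[str, str]],
    primary_node: str,
    attribute: str,
) -> List[str]:
    """Return the nodes that may hold the replica of a shard (simplified model).

    nodes maps a hypothetical node name to its attributes, for example
    {"node1": {"rack_id": "rack_one"}}.
    """
    primary_value = nodes[primary_node][attribute]

    # Nodes that do not set the awareness attribute are never eligible.
    eligible = {
        name: attrs[attribute]
        for name, attrs in nodes.items()
        if attribute in attrs and name != primary_node
    }

    # Prefer nodes whose attribute value differs from the primary's.
    other_racks = sorted(n for n, v in eligible.items() if v != primary_value)
    if other_racks:
        return other_racks

    # Only one attribute value exists so far: fall back to the same rack.
    return sorted(eligible)


if __name__ == "__main__":
    cluster = {
        "node1": {"rack_id": "rack_one"},
        "node2": {"rack_id": "rack_one"},
    }
    print(replica_candidates(cluster, "node1", "rack_id"))

    cluster["node3"] = {"rack_id": "rack_two"}
    cluster["node4"] = {"rack_id": "rack_two"}
    print(replica_candidates(cluster, "node1", "rack_id"))
```

With only rack_one nodes running, the replica stays in the same rack. After the rack_two nodes join, only they remain candidates.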
Forced Awareness
Sometimes we know in advance the number of values an awareness attribute can have and, moreover, we never want more replicas than needed allocated on a specific group of nodes with the same awareness attribute value. For that, we can force awareness on specific attributes.
For example, let's say we have an awareness attribute called zone, and we know we are going to have two zones, zone1 and zone2. Here is how we can force awareness on a node:
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
Now, let's say we start 2 nodes with node.zone set to zone1 and create an index with 5 shards and 1 replica. The index will be created, but only 5 shards will be allocated (with no replicas). Only when we start more nodes with node.zone set to zone2 will the replicas be allocated.
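The arithmetic behind the forced awareness example can be sketched in Python. This is a simplified model under stated assumptions, not the real allocation code: it assumes each forced value may hold at most an even share (rounded up) of a shard's copies, and that a node holds at most one copy of a given shard. The function and parameter names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def allocate_with_forced_awareness(
    num_shards: int,
    num_replicas: int,
    forced_values: List[str],
    nodes_per_value: Dict[str, int],
) -> Tuple[int, int]:
    """Return (allocated, unassigned) shard copies under a simplified model."""
    copies_per_shard = 1 + num_replicas

    # Each forced value gets at most its even share of a shard's copies,
    # whether or not any node with that value has started yet.
    cap_per_value = math.ceil(copies_per_shard / len(forced_values))

    placed_per_shard = 0
    for value in forced_values:
        # A node holds at most one copy of a given shard.
        placed_per_shard += min(cap_per_value, nodes_per_value.get(value, 0))
    placed_per_shard = min(placed_per_shard, copies_per_shard)

    allocated = num_shards * placed_per_shard
    unassigned = num_shards * copies_per_shard - allocated
    return allocated, unassigned


if __name__ == "__main__":
    zones = ["zone1", "zone2"]
    # Two nodes in zone1 only: primaries are allocated, replicas wait.
    print(allocate_with_forced_awareness(5, 1, zones, {"zone1": 2}))  # (5, 5)
    # Nodes in zone2 join: the replicas can now be allocated.
    print(allocate_with_forced_awareness(5, 1, zones, {"zone1": 2, "zone2": 2}))  # (10, 0)
```

Passing only the values that currently have nodes (for example `["zone1"]`) models the non-forced case, where all 10 copies fit on the two zone1 nodes.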
Automatic Preference When Searching / GETing
When executing a search, or doing a get, the node receiving the request will prefer to execute the request on shards that exist on nodes that have the same attribute values as the executing node.
Realtime Settings Update
The settings can be updated using the cluster update settings API on a live cluster.
Shard Allocation Filtering
Allows controlling the allocation of indices on nodes based on include/exclude filters. The filters can be set both on the index level and on the cluster level. Let's start with an example of setting it on the index level:
Let's say we have 4 nodes, each with a specific attribute called tag associated with it (the attribute can have any name). Each node has a specific value associated with tag. Node 1 has a setting node.tag: value1, Node 2 a setting of node.tag: value2, and so on.
We can create an index that will only deploy on nodes that have tag set to value1 or value2 by setting index.routing.allocation.include.tag to value1,value2. For example:
curl -XPUT localhost:9200/test/_settings -d '{ "index.routing.allocation.include.tag" : "value1,value2" }'
On the other hand, we can create an index that will be deployed on all nodes except for nodes with a tag of value value3 by setting index.routing.allocation.exclude.tag to value3. For example:
curl -XPUT localhost:9200/test/_settings -d '{ "index.routing.allocation.exclude.tag" : "value3" }'
From version 0.90, index.routing.allocation.require.* can be used to specify a number of rules, all of which MUST match in order for a shard to be allocated to a node. This is in contrast to include, which will include a node if ANY rule matches.
The include, exclude and require values can have generic simple matching wildcards, for example, value1*. A special attribute name called _ip can be used to match on node IP values. In addition, the _host attribute can be used to match on either the node’s hostname or its IP address. Similarly, the _name and _id attributes can be used to match on node name and node ID respectively.
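The include, exclude and require semantics described above can be sketched as a small Python predicate. This is an illustrative model only, not the server's implementation: the function names are hypothetical, only custom node attributes are handled (not _ip, _host, _name or _id), and the wildcard support is assumed to be limited to `*`.

```python
import re
from typing import Dict, Optional


def _simple_match(pattern: str, value: str) -> bool:
    """Match value against a pattern where '*' stands for any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def _rule_matches(patterns: str, value: Optional[str]) -> bool:
    """A rule is a comma-separated list of patterns; any one may match."""
    if value is None:
        return False
    return any(_simple_match(p.strip(), value) for p in patterns.split(","))


def node_allowed(
    node_attrs: Dict[str, str],
    include: Optional[Dict[str, str]] = None,
    exclude: Optional[Dict[str, str]] = None,
    require: Optional[Dict[str, str]] = None,
) -> bool:
    """Decide whether a shard may be allocated to a node with node_attrs."""
    # require: ALL rules must match.
    if require and not all(
        _rule_matches(p, node_attrs.get(a)) for a, p in require.items()
    ):
        return False
    # include: at least ONE rule must match.
    if include and not any(
        _rule_matches(p, node_attrs.get(a)) for a, p in include.items()
    ):
        return False
    # exclude: NO rule may match.
    if exclude and any(
        _rule_matches(p, node_attrs.get(a)) for a, p in exclude.items()
    ):
        return False
    return True


if __name__ == "__main__":
    print(node_allowed({"tag": "value1"}, include={"tag": "value1,value2"}))  # True
    print(node_allowed({"tag": "value3"}, exclude={"tag": "value3"}))  # False
```

Each dictionary maps an attribute name (the part after include., exclude. or require. in the setting key) to its comma-separated value string.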
Obviously a node can have several attributes associated with it, and both the attribute name and value are controlled in the setting. For example, here is a sample of several node configurations:
node.group1: group1_value1
node.group2: group2_value4
In the same manner, include, exclude and require can work against several attributes, for example:
curl -XPUT localhost:9200/test/_settings -d '{ "index.routing.allocation.include.group1" : "xxx", "index.routing.allocation.include.group2" : "yyy", "index.routing.allocation.exclude.group3" : "zzz", "index.routing.allocation.require.group4" : "aaa" }'
The provided settings can also be updated in real time using the update settings API, which allows "moving" indices (shards) around in real time.
Cluster-wide filtering can also be defined, and updated in real time using the cluster update settings API. This setting can come in handy for things like decommissioning nodes (even if the replica count is set to 0). Here is a sample of how to decommission a node based on _ip address:
curl -XPUT localhost:9200/_cluster/settings -d '{ "transient" : { "cluster.routing.allocation.exclude._ip" : "" } }'
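For scripted decommissioning, the same request could be prepared from Python with the standard library. The sketch below only builds the request and does not send it. The function name is hypothetical, and the IP address in the usage example is a placeholder from a documentation range, since the node address depends on your cluster.

```python
import json
import urllib.request


def decommission_request(
    ip: str, base_url: str = "http://localhost:9200"
) -> urllib.request.Request:
    """Build (but do not send) a cluster settings update excluding one IP."""
    body = {
        "transient": {
            # Same setting key as in the curl example above.
            "cluster.routing.allocation.exclude._ip": ip,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address, not taken from the original page.
    request = decommission_request("192.0.2.10")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually apply it against a live cluster:
    # urllib.request.urlopen(request)
```

Keeping the request construction separate from sending makes it easy to inspect the body before touching a live cluster.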