Cluster-level shard allocation and routing settings
Shard allocation is the process of allocating shards to nodes. This can happen during initial recovery, replica allocation, rebalancing, or when nodes are added or removed.
One of the main roles of the master is to decide which shards to allocate to which nodes, and when to move shards between nodes in order to rebalance the cluster.
There are a number of settings available to control the shard allocation process:
- Cluster-level shard allocation settings control allocation and rebalancing operations.
- Disk-based shard allocation settings explain how Elasticsearch takes available disk space into account, and describe the related settings.
- Shard allocation awareness and Forced awareness control how shards can be distributed across different racks or availability zones.
- Cluster-level shard allocation filtering allows certain nodes or groups of nodes to be excluded from allocation so that they can be decommissioned.
Besides these, there are a few other miscellaneous cluster-level settings.
Cluster-level shard allocation settings
You can use the following settings to control shard allocation and recovery:
- cluster.routing.allocation.enable
  (Dynamic) Enable or disable allocation for specific kinds of shards:
  - all: (default) Allows shard allocation for all kinds of shards.
  - primaries: Allows shard allocation only for primary shards.
  - new_primaries: Allows shard allocation only for primary shards for new indices.
  - none: No shard allocations of any kind are allowed for any indices.
  This setting does not affect the recovery of local primary shards when restarting a node. A restarted node that has a copy of an unassigned primary shard will recover that primary immediately, assuming that its allocation id matches one of the active allocation ids in the cluster state.
- cluster.routing.allocation.node_concurrent_incoming_recoveries
  (Dynamic) How many concurrent incoming shard recoveries are allowed to happen on a node. Incoming recoveries are the recoveries where the target shard (most likely the replica unless a shard is relocating) is allocated on the node. Defaults to 2.
- cluster.routing.allocation.node_concurrent_outgoing_recoveries
  (Dynamic) How many concurrent outgoing shard recoveries are allowed to happen on a node. Outgoing recoveries are the recoveries where the source shard (most likely the primary unless a shard is relocating) is allocated on the node. Defaults to 2.
- cluster.routing.allocation.node_concurrent_recoveries
  (Dynamic) A shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries.
- cluster.routing.allocation.node_initial_primaries_recoveries
  (Dynamic) While the recovery of replicas happens over the network, the recovery of an unassigned primary after node restart uses data from the local disk. These should be fast so more initial primary recoveries can happen in parallel on the same node. Defaults to 4.
- cluster.routing.allocation.same_shard.host
  (Dynamic) Performs a check to prevent allocation of multiple instances of the same shard on a single host, based on host name and host address. Defaults to false, meaning that no check is performed by default. This setting only applies if multiple nodes are started on the same machine.
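As a rough illustration of how the allocation setting above might be changed at runtime, the following sketch builds a request body for the cluster-update-settings API (PUT _cluster/settings). The helper name allocation_enable_body and its persistent flag are hypothetical and used only for this example; the sketch only constructs the JSON body and does not contact a cluster.

```python
import json

# Values documented above for cluster.routing.allocation.enable.
ALLOWED_VALUES = {"all", "primaries", "new_primaries", "none"}


def allocation_enable_body(value, persistent=False):
    """Build a cluster-update-settings body (hypothetical helper).

    The body would be sent with: PUT _cluster/settings
    """
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported value: {value!r}")
    # "transient" matches the examples in this page; "persistent" is the
    # other scope accepted by the cluster-update-settings API.
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.allocation.enable": value}}


if __name__ == "__main__":
    print(json.dumps(allocation_enable_body("primaries"), indent=2))
```

Validating the value before sending the request is a design choice of this sketch; Elasticsearch also rejects unknown values on its side.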
Shard rebalancing settings
A cluster is balanced when it has an equal number of shards on each node without having a concentration of shards from any index on any node. Elasticsearch runs an automatic process called rebalancing, which moves shards between the nodes in your cluster to improve its balance. Rebalancing obeys all other shard allocation rules, such as allocation filtering and forced awareness, which may prevent it from completely balancing the cluster. In that case, rebalancing strives to achieve the most balanced cluster possible within the rules you have configured. If you are using data tiers, then Elasticsearch automatically applies allocation filtering rules to place each shard within the appropriate tier. These rules mean that the balancer works independently within each tier.
You can use the following settings to control the rebalancing of shards across the cluster:
- cluster.routing.rebalance.enable
  (Dynamic) Enable or disable rebalancing for specific kinds of shards:
  - all: (default) Allows shard balancing for all kinds of shards.
  - primaries: Allows shard balancing only for primary shards.
  - replicas: Allows shard balancing only for replica shards.
  - none: No shard balancing of any kind is allowed for any indices.
- cluster.routing.allocation.allow_rebalance
  (Dynamic) Specify when shard rebalancing is allowed:
  - always: Always allow rebalancing.
  - indices_primaries_active: Only when all primaries in the cluster are allocated.
  - indices_all_active: (default) Only when all shards (primaries and replicas) in the cluster are allocated.
- cluster.routing.allocation.cluster_concurrent_rebalance
  (Dynamic) Controls how many concurrent shard rebalances are allowed cluster wide. Defaults to 2. Note that this setting only controls the number of concurrent shard relocations due to imbalances in the cluster. This setting does not limit shard relocations due to allocation filtering or forced awareness.
Shard balancing heuristics settings
Rebalancing works by computing a weight for each node based on its allocation of shards, and then moving shards between nodes to reduce the weight of the heavier nodes and increase the weight of the lighter ones. The cluster is balanced when there is no possible shard movement that can bring the weight of any node closer to the weight of any other node by more than a configurable threshold. The following settings allow you to control the details of these calculations.
- cluster.routing.allocation.balance.shard
  (Dynamic) Defines the weight factor for the total number of shards allocated on a node (float). Defaults to 0.45f. Raising this raises the tendency to equalize the number of shards across all nodes in the cluster.
- cluster.routing.allocation.balance.index
  (Dynamic) Defines the weight factor for the number of shards per index allocated on a specific node (float). Defaults to 0.55f. Raising this raises the tendency to equalize the number of shards per index across all nodes in the cluster.
- cluster.routing.allocation.balance.threshold
  (Dynamic) Minimal optimization value of operations that should be performed (non negative float). Defaults to 1.0f. Raising this will cause the cluster to be less aggressive about optimizing the shard balance.
Regardless of the result of the balancing algorithm, rebalancing might not be allowed due to forced awareness or allocation filtering.
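To make the roles of the three heuristics settings more concrete, here is a simplified model of a per-node weight and the threshold test. The exact formula is an assumption made for illustration (a weighted sum of how far a node is from the average shard count, overall and for one index); it is not taken from this page and may differ from what Elasticsearch actually computes. The function names are hypothetical.

```python
def node_weight(node_shards, node_index_shards, avg_shards, avg_index_shards,
                shard_balance=0.45, index_balance=0.55):
    """Illustrative weight of one node with respect to one index.

    ASSUMPTION: the weight is modeled as a normalized, weighted sum of the
    node's deviation from the average total shard count and from the average
    shard count of the index. This is a sketch, not the real implementation.
    """
    total = shard_balance + index_balance
    theta_shard = shard_balance / total
    theta_index = index_balance / total
    return (theta_shard * (node_shards - avg_shards)
            + theta_index * (node_index_shards - avg_index_shards))


def worth_rebalancing(heavy_weight, light_weight, threshold=1.0):
    """A move is only considered when the weight gap exceeds the threshold."""
    return heavy_weight - light_weight > threshold


if __name__ == "__main__":
    # Two nodes, one index with 6 shards: node A holds 5, node B holds 1.
    weight_a = node_weight(5, 5, avg_shards=3, avg_index_shards=3)
    weight_b = node_weight(1, 1, avg_shards=3, avg_index_shards=3)
    print(weight_a, weight_b, worth_rebalancing(weight_a, weight_b))
```

In this model, raising the threshold makes worth_rebalancing return False for small gaps, which matches the description of a less aggressive balancer.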
Disk-based shard allocation settings
The disk-based shard allocator ensures that all nodes have enough disk space without performing more shard movements than necessary. It allocates shards based on a pair of thresholds known as the low watermark and the high watermark. Its primary goal is to ensure that no node exceeds the high watermark, or at least that any such overage is only temporary. If a node exceeds the high watermark then Elasticsearch will solve this by moving some of its shards onto other nodes in the cluster.
It is normal for nodes to temporarily exceed the high watermark from time to time.
The allocator also tries to keep nodes clear of the high watermark by forbidding the allocation of more shards to a node that exceeds the low watermark. Importantly, if all of your nodes have exceeded the low watermark then no new shards can be allocated and Elasticsearch will not be able to move any shards between nodes in order to keep the disk usage below the high watermark. You must ensure that your cluster has enough disk space in total and that there are always some nodes below the low watermark.
Shard movements triggered by the disk-based shard allocator must also satisfy all other shard allocation rules such as allocation filtering and forced awareness. If these rules are too strict then they can also prevent the shard movements needed to keep the nodes' disk usage under control. If you are using data tiers then Elasticsearch automatically configures allocation filtering rules to place shards within the appropriate tier, which means that the disk-based shard allocator works independently within each tier.
If a node is filling up its disk faster than Elasticsearch can move shards elsewhere then there is a risk that the disk will completely fill up. To prevent this, as a last resort, once the disk usage reaches the flood-stage watermark Elasticsearch will block writes to indices with a shard on the affected node. It will also continue to move shards onto the other nodes in the cluster. When disk usage on the affected node drops below the high watermark, Elasticsearch automatically removes the write block.
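The watermark behavior described above can be summarized as a small classification of a node's disk usage. The sketch below is a simplified model, not the allocator itself: the function name disk_state and the state labels are hypothetical, and the default thresholds are the documented defaults of 85%, 90% and 95% used disk.

```python
# What the text above says happens in each state (paraphrased).
ACTIONS = {
    "ok": "shards may be allocated to this node",
    "above_low": "no more shards are allocated to this node",
    "above_high": "Elasticsearch tries to move shards off this node",
    "above_flood_stage": "writes are blocked on indices with a shard on this node",
}


def disk_state(used_percent, low=85.0, high=90.0, flood_stage=95.0):
    """Classify a node by its used disk percentage (illustrative only)."""
    if not low < high < flood_stage:
        raise ValueError("expected low < high < flood_stage")
    if used_percent > flood_stage:
        return "above_flood_stage"
    if used_percent > high:
        return "above_high"
    if used_percent > low:
        return "above_low"
    return "ok"


if __name__ == "__main__":
    for used in (50.0, 87.0, 92.0, 97.0):
        state = disk_state(used)
        print(f"{used}% used: {state} ({ACTIONS[state]})")
```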
It is normal for the nodes in your cluster to be using very different amounts of disk space. The balance of the cluster depends only on the number of shards on each node and the indices to which those shards belong. It considers neither the sizes of these shards nor the available disk space on each node, for the following reasons:
- Disk usage changes over time. Balancing the disk usage of individual nodes would require a lot more shard movements, perhaps even wastefully undoing earlier movements. Moving a shard consumes resources such as I/O and network bandwidth and may evict data from the filesystem cache. These resources are better spent handling your searches and indexing where possible.
- A cluster with equal disk usage on every node typically performs no better than one that has unequal disk usage, as long as no disk is too full.
You can use the following settings to control disk-based allocation:
- cluster.routing.allocation.disk.threshold_enabled
  (Dynamic) Defaults to true. Set to false to disable the disk allocation decider.
- cluster.routing.allocation.disk.watermark.low
  (Dynamic) Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated.
- cluster.routing.allocation.disk.watermark.high
  (Dynamic) Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.
- cluster.routing.allocation.disk.watermark.enable_for_single_data_node
  (Static) For a single data node, the default is to disregard disk watermarks when making an allocation decision. This is deprecated behavior and will be changed in 8.0. This setting can be set to true to enable the disk watermarks for a single data node cluster (will become default in 8.0).
- cluster.routing.allocation.disk.watermark.flood_stage
  (Dynamic) Controls the flood stage watermark, which defaults to 95%. Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node, and that has at least one disk exceeding the flood stage. This setting is a last resort to prevent nodes from running out of disk space. The index block is automatically released when the disk utilization falls below the high watermark.

  You cannot mix the usage of percentage values and byte values within these settings. Either all values are set to percentage values, or all are set to byte values. This enforcement is so that Elasticsearch can validate that the settings are internally consistent, ensuring that the low disk threshold is less than the high disk threshold, and the high disk threshold is less than the flood stage threshold.

  An example of resetting the read-only index block on the my-index-000001 index:

  PUT /my-index-000001/_settings
  {
    "index.blocks.read_only_allow_delete": null
  }
- cluster.info.update.interval
  (Dynamic) How often Elasticsearch should check on disk usage for each node in the cluster. Defaults to 30s.
- cluster.routing.allocation.disk.include_relocations
  [7.5.0] Deprecated in 7.5.0. Future versions will always account for relocations. Defaults to true, which means that Elasticsearch will take into account shards that are currently being relocated to the target node when computing a node’s disk usage. Taking relocating shards' sizes into account may, however, mean that the disk usage for a node is incorrectly estimated on the high side, since the relocation could be 90% complete and a recently retrieved disk usage would include the total size of the relocating shard as well as the space already used by the running relocation.
Percentage values refer to used disk space, while byte values refer to free disk space. This can be confusing, since it flips the meaning of high and low. For example, it makes sense to set the low watermark to 10gb and the high watermark to 5gb, but not the other way around.
An example of updating the low watermark to at least 100 gigabytes free, a high watermark of at least 50 gigabytes free, and a flood stage watermark of 10 gigabytes free, and updating the information about the cluster every minute:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "100gb",
    "cluster.routing.allocation.disk.watermark.high": "50gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
    "cluster.info.update.interval": "1m"
  }
}
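Because percentage values count used space and byte values count free space, the expected ordering of the three watermarks flips between the two forms. The following sketch checks a set of watermark values before they are sent to the cluster. It is a simplified client-side model under stated assumptions: the helper names are hypothetical, only the "85%" and "100gb" style notations are parsed, byte units are treated as powers of 1024, and the strict ordering follows the wording above rather than the server's own validation code.

```python
import re

# Byte-size suffixes handled by this sketch (assumed powers of 1024).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3,
          "tb": 1024 ** 4, "pb": 1024 ** 5}


def parse_watermark(value):
    """Return ("percent", used_percent) or ("bytes", free_bytes)."""
    text = value.strip().lower()
    if text.endswith("%"):
        return "percent", float(text[:-1])
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(b|kb|mb|gb|tb|pb)", text)
    if match is None:
        raise ValueError(f"unrecognized watermark value: {value!r}")
    return "bytes", float(match.group(1)) * _UNITS[match.group(2)]


def check_watermarks(low, high, flood_stage):
    """Validate that the three watermarks are consistent; return their kind."""
    parsed = [parse_watermark(v) for v in (low, high, flood_stage)]
    kinds = {kind for kind, _ in parsed}
    if len(kinds) != 1:
        raise ValueError("cannot mix percentage and byte values")
    lo, hi, flood = (amount for _, amount in parsed)
    kind = kinds.pop()
    # Percentages measure used space (low < high < flood stage);
    # byte values measure free space, so the order is reversed.
    ordered = lo < hi < flood if kind == "percent" else lo > hi > flood
    if not ordered:
        raise ValueError("watermarks are not in a consistent order")
    return kind


if __name__ == "__main__":
    print(check_watermarks("85%", "90%", "95%"))
    print(check_watermarks("100gb", "50gb", "10gb"))
```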
Shard allocation awareness
You can use custom node attributes as awareness attributes to enable Elasticsearch to take your physical hardware configuration into account when allocating shards. If Elasticsearch knows which nodes are on the same physical server, in the same rack, or in the same zone, it can distribute the primary shard and its replica shards to minimise the risk of losing all shard copies in the event of a failure.
When shard allocation awareness is enabled with the dynamic cluster.routing.allocation.awareness.attributes setting, shards are only allocated to nodes that have values set for the specified awareness attributes. If you use multiple awareness attributes, Elasticsearch considers each attribute separately when allocating shards.
By default, Elasticsearch uses adaptive replica selection to route search or GET requests. However, when allocation awareness attributes are present, Elasticsearch prefers shards in the same location (with the same awareness attribute values) to process these requests. To disable this behavior, set the es.search.ignore_awareness_attributes system property to true on every node that is part of the cluster, for example with export ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.search.ignore_awareness_attributes=true".
The number of attribute values determines how many shard copies are allocated in each location. If the number of nodes in each location is unbalanced and there are a lot of replicas, replica shards might be left unassigned.
Enabling shard allocation awareness
To enable shard allocation awareness:
- Specify the location of each node with a custom node attribute. For example, if you want Elasticsearch to distribute shards across different racks, you might set an awareness attribute called rack_id in each node’s elasticsearch.yml config file.
  node.attr.rack_id: rack_one
  You can also set custom attributes when you start a node:
  `./bin/elasticsearch -Enode.attr.rack_id=rack_one`
- Tell Elasticsearch to take one or more awareness attributes into account when allocating shards by setting cluster.routing.allocation.awareness.attributes in every master-eligible node’s elasticsearch.yml config file.
  cluster.routing.allocation.awareness.attributes: rack_id
  You can also use the cluster-update-settings API to set or update a cluster’s awareness attributes.
With this example configuration, if you start two nodes with node.attr.rack_id set to rack_one and create an index with 5 primary shards and 1 replica of each primary, all primaries and replicas are allocated across the two nodes.
If you add two nodes with node.attr.rack_id set to rack_two, Elasticsearch moves shards to the new nodes, ensuring (if possible) that no two copies of the same shard are in the same rack.
If rack_two fails and takes down both its nodes, by default Elasticsearch allocates the lost shard copies to nodes in rack_one. To prevent multiple copies of a particular shard from being allocated in the same location, you can enable forced awareness.
Forced awareness
By default, if one location fails, Elasticsearch assigns all of the missing replica shards to the remaining locations. While you might have sufficient resources across all locations to host your primary and replica shards, a single location might be unable to host ALL of the shards.
To prevent a single location from being overloaded in the event of a failure, you can set cluster.routing.allocation.awareness.force so no replicas are allocated until nodes are available in another location.
For example, if you have an awareness attribute called zone and configure nodes in zone1 and zone2, you can use forced awareness to prevent Elasticsearch from allocating replicas if only one zone is available:
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
With this example configuration, if you start two nodes with node.attr.zone set to zone1 and create an index with 5 shards and 1 replica, Elasticsearch creates the index and allocates the 5 primary shards but no replicas. Replicas are only allocated once nodes with node.attr.zone set to zone2 are available.
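One way to reason about the difference between plain and forced awareness is to count how many copies of a shard may share one location. The sketch below is a simplified model under an assumption that is not stated on this page: copies are spread evenly over every known location, where forced values count as known locations even when no node is currently running there. The function name is hypothetical.

```python
import math


def max_copies_per_location(total_copies, present_values, forced_values=()):
    """Upper bound on copies of one shard in a single location (sketch).

    ASSUMPTION: the bound is the ceiling of total copies divided by the
    number of distinct locations, counting forced values as locations
    even if they currently have no nodes.
    """
    locations = set(present_values) | set(forced_values)
    if not locations:
        raise ValueError("at least one location is required")
    return math.ceil(total_copies / len(locations))


if __name__ == "__main__":
    copies = 2  # one primary plus one replica

    # Plain awareness, only rack_one is up: both copies may live there.
    print(max_copies_per_location(copies, ["rack_one"]))

    # Forced awareness with zone1 and zone2, only zone1 is up:
    # a single copy fits in zone1, so the replica stays unassigned.
    print(max_copies_per_location(copies, ["zone1"], ["zone1", "zone2"]))
```

Under this model the forced values keep the divisor at two even when zone2 is down, which is why the replica in the example above is held back until zone2 nodes are available.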
Cluster-level shard allocation filtering
You can use cluster-level shard allocation filters to control where Elasticsearch allocates shards from any index. These cluster wide filters are applied in conjunction with per-index allocation filtering and allocation awareness.
Shard allocation filters can be based on custom node attributes or the built-in _name, _host_ip, _publish_ip, _ip, _host, _id and _tier attributes.
The cluster.routing.allocation settings are dynamic, enabling live indices to be moved from one set of nodes to another. Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.
The most common use case for cluster-level shard allocation filtering is when you want to decommission a node. To move shards off of a node prior to shutting it down, you could create a filter that excludes the node by its IP address:
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}
Cluster routing settings
- cluster.routing.allocation.include.{attribute}
  (Dynamic) Allocate shards to a node whose {attribute} has at least one of the comma-separated values.
- cluster.routing.allocation.require.{attribute}
  (Dynamic) Only allocate shards to a node whose {attribute} has all of the comma-separated values.
- cluster.routing.allocation.exclude.{attribute}
  (Dynamic) Do not allocate shards to a node whose {attribute} has any of the comma-separated values.
The cluster allocation settings support the following built-in attributes:
| Attribute | Description |
| _name | Match nodes by node name |
| _host_ip | Match nodes by host IP address (IP associated with hostname) |
| _publish_ip | Match nodes by publish IP address |
| _ip | Match either _host_ip or _publish_ip |
| _host | Match nodes by hostname |
| _id | Match nodes by node id |
| _tier | Match nodes by the node’s data tier role |
You can use wildcards when specifying attribute values, for example:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "192.168.2.*"
  }
}
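The include, require and exclude rules, together with wildcard values, can be modeled as a small predicate over a node's attributes. This is a rough sketch that follows the wording of the setting descriptions above; it is not the filtering code used by Elasticsearch. The function node_allowed and its arguments are hypothetical, and Python's fnmatchcase is swapped in for wildcard matching, which also understands ? and [] patterns that may not apply to Elasticsearch.

```python
from fnmatch import fnmatchcase


def _patterns(csv):
    """Split a comma-separated setting value into trimmed patterns."""
    return [p.strip() for p in csv.split(",") if p.strip()]


def node_allowed(node_attrs, include=None, require=None, exclude=None):
    """Decide whether shards may be allocated to a node (illustrative only).

    Each filter maps an attribute name (for example "_ip" or "rack_id")
    to a comma-separated list of values, which may contain "*" wildcards.
    """
    # exclude: reject the node if its attribute matches any listed value.
    for attr, csv in (exclude or {}).items():
        value = node_attrs.get(attr)
        if value is not None and any(fnmatchcase(value, p) for p in _patterns(csv)):
            return False
    # require: the attribute must match all listed values.
    for attr, csv in (require or {}).items():
        value = node_attrs.get(attr)
        if value is None or not all(fnmatchcase(value, p) for p in _patterns(csv)):
            return False
    # include: if present, the attribute must match at least one listed value.
    if include:
        return any(
            attr in node_attrs
            and any(fnmatchcase(node_attrs[attr], p) for p in _patterns(csv))
            for attr, csv in include.items()
        )
    return True


if __name__ == "__main__":
    node = {"_ip": "192.168.2.7", "rack_id": "rack_one"}
    print(node_allowed(node, exclude={"_ip": "192.168.2.*"}))
    print(node_allowed(node, include={"rack_id": "rack_one,rack_two"}))
```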
Miscellaneous cluster settings
Metadata
An entire cluster may be set to read-only with the following setting:
-
cluster.blocks.read_only
- (Dynamic) Make the whole cluster read only (indices do not accept write operations), metadata is not allowed to be modified (create or delete indices).
-
cluster.blocks.read_only_allow_delete
-
(Dynamic)
Identical to
cluster.blocks.read_only
but allows to delete indices to free up resources.
Don’t rely on this setting to prevent changes to your cluster. Any user with access to the cluster-update-settings API can make the cluster read-write again.
Cluster shard limit
There is a soft limit on the number of shards in a cluster, based on the number of nodes in the cluster. This is intended to prevent operations which may unintentionally destabilize the cluster.
This limit is intended as a safety net, not a sizing recommendation. The exact number of shards your cluster can safely support depends on your hardware configuration and workload, but should remain well below this limit in almost all cases, as the default limit is set quite high.
If an operation, such as creating a new index, restoring a snapshot of an index, or opening a closed index, would push the number of shards in the cluster over this limit, the operation fails with an error indicating the shard limit.
If the cluster is already over the limit, due to changes in node membership or setting changes, all operations that create or open indices will fail until either the limit is increased as described below, or some indices are closed or deleted to bring the number of shards below the limit.
The cluster shard limit defaults to 1,000 shards per data node. Both primary and replica shards of all open indices count toward the limit, including unassigned shards. For example, an open index with 5 primary shards and 2 replicas counts as 15 shards. Closed indices do not contribute to the shard count.
You can dynamically adjust the cluster shard limit with the following setting:
- cluster.max_shards_per_node
  (Dynamic) Limits the total number of primary and replica shards for the cluster. Elasticsearch calculates the limit as follows:
  cluster.max_shards_per_node * number of data nodes
  Shards for closed indices do not count toward this limit. Defaults to 1000. A cluster with no data nodes is unlimited.
  Elasticsearch rejects any request that creates more shards than this limit allows. For example, a cluster with a cluster.max_shards_per_node setting of 100 and three data nodes has a shard limit of 300. If the cluster already contains 296 shards, Elasticsearch rejects any request that adds five or more shards to the cluster.
  This setting does not limit shards for individual nodes. To limit the number of shards for each node, use the cluster.routing.allocation.total_shards_per_node setting.
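The arithmetic behind the limit can be written out as a small sketch. The function names below are hypothetical and only restate the rules given in this section: an open index counts its primaries plus all replica copies, the limit is the per-node setting multiplied by the number of data nodes, and a cluster with no data nodes is treated as unlimited.

```python
def index_shard_count(primaries, replicas):
    """Total shards for an open index: each primary plus its replica copies."""
    return primaries * (1 + replicas)


def cluster_shard_limit(max_shards_per_node, data_nodes):
    """Return the cluster-wide limit, or None when there are no data nodes."""
    if data_nodes == 0:
        return None  # described in the text as unlimited
    return max_shards_per_node * data_nodes


def request_allowed(current_shards, new_shards,
                    max_shards_per_node=1000, data_nodes=1):
    """Check whether adding new_shards keeps the cluster within its limit."""
    limit = cluster_shard_limit(max_shards_per_node, data_nodes)
    return limit is None or current_shards + new_shards <= limit


if __name__ == "__main__":
    print(index_shard_count(5, 2))           # 5 primaries, 2 replicas each
    print(cluster_shard_limit(100, 3))       # limit for three data nodes
    print(request_allowed(296, 5, 100, 3))   # would exceed the limit
    print(request_allowed(296, 4, 100, 3))   # lands exactly on the limit
```

With the values from the example above, the limit is 300, so a request adding four shards to a cluster holding 296 is accepted while one adding five is rejected.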
User-defined cluster metadata
User-defined metadata can be stored and retrieved using the Cluster Settings API. This can be used to store arbitrary, infrequently-changing data about the cluster without the need to create an index to store it. This data may be stored using any key prefixed with cluster.metadata. (including the trailing dot). For example, to store the email address of the administrator of a cluster under the key cluster.metadata.administrator, issue this request:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.metadata.administrator": "sysadmin@example.com"
  }
}
User-defined cluster metadata is not intended to store sensitive or confidential information. Any information stored in user-defined cluster metadata will be viewable by anyone with access to the Cluster Get Settings API, and is recorded in the Elasticsearch logs.
Index tombstones
The cluster state maintains index tombstones to explicitly denote indices that have been deleted. The number of tombstones maintained in the cluster state is controlled by the following setting:
- cluster.indices.tombstones.size
  (Static) Index tombstones prevent nodes that are not part of the cluster when a delete occurs from joining the cluster and reimporting the index as though the delete was never issued. To keep the cluster state from growing huge we only keep the last cluster.indices.tombstones.size deletes, which defaults to 500. You can increase it if you expect nodes to be absent from the cluster and miss more than 500 deletes. We think that is rare, thus the default. Tombstones don’t take up much space, but we also think that a number like 50,000 is probably too big.
If Elasticsearch encounters index data that is absent from the current cluster state, those indices are considered to be dangling. For example, this can happen if you delete more than cluster.indices.tombstones.size indices while an Elasticsearch node is offline.
You can use the Dangling indices API to manage this situation.
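The interaction between a bounded tombstone list and dangling indices can be illustrated with a simplified model. This is not how Elasticsearch stores tombstones internally. The class `TombstoneLog` and its methods are hypothetical, and indices are identified by plain names purely for illustration. The sketch keeps only the most recent deletions and shows that an index deleted long enough ago to have lost its tombstone looks dangling when an offline node rejoins.

```python
from collections import deque


class TombstoneLog:
    """Keep only the most recent `size` index deletions."""

    def __init__(self, size=500):
        # A bounded deque discards the oldest entry once it is full.
        self._names = deque(maxlen=size)

    def record_delete(self, index_name):
        self._names.append(index_name)

    def classify(self, local_indices, cluster_indices):
        """Split a joining node's on-disk indices into three groups."""
        known = set(cluster_indices)
        tombstoned = set(self._names)
        result = {"current": [], "deleted": [], "dangling": []}
        for name in local_indices:
            if name in known:
                result["current"].append(name)
            elif name in tombstoned:
                # The delete is still remembered, so the index is not reimported.
                result["deleted"].append(name)
            else:
                # Absent from the cluster state and no tombstone remains.
                result["dangling"].append(name)
        return result


if __name__ == "__main__":
    log = TombstoneLog(size=2)
    for name in ["a", "b", "c"]:   # three deletes, but only two are retained
        log.record_delete(name)
    print(log.classify(["a", "c", "live"], ["live"]))
```

In this run the tombstone for `a` has been dropped, so `a` is classified as dangling, while `c` is still recognised as deleted.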
Logger
The settings which control logging can be updated dynamically with the logger. prefix. For instance, to increase the logging level of the indices.recovery module to DEBUG, issue this request:
PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.indices.recovery": "DEBUG"
  }
}
Persistent tasks allocation
Plugins can create a kind of task called a persistent task. These tasks are usually long-lived and are stored in the cluster state, allowing them to be revived after a full cluster restart.
Every time a persistent task is created, the master node assigns the task to a node in the cluster, and the assigned node then picks up the task and executes it locally. The process of assigning persistent tasks to nodes is controlled by the following settings:
- cluster.persistent_tasks.allocation.enable
  (Dynamic) Enable or disable allocation for persistent tasks:
  - all: (default) Allows persistent tasks to be assigned to nodes
  - none: No allocations are allowed for any type of persistent task
  This setting does not affect the persistent tasks that are already being executed. Only newly created persistent tasks, or tasks that must be reassigned (after a node left the cluster, for example), are impacted by this setting.
- cluster.persistent_tasks.allocation.recheck_interval
  (Dynamic) The master node automatically checks whether persistent tasks need to be assigned when the cluster state changes significantly. However, there may be other factors, such as memory usage, that affect whether persistent tasks can be assigned to nodes but do not cause the cluster state to change. This setting controls how often assignment checks are performed to react to these factors. The default is 30 seconds. The minimum permitted value is 10 seconds.