Shrink Index
The shrink index API allows you to shrink an existing index into a new index with fewer primary shards. The requested number of primary shards in the target index must be a factor of the number of shards in the source index. For example, an index with 8 primary shards can be shrunk into 4, 2 or 1 primary shards, or an index with 15 primary shards can be shrunk into 5, 3 or 1. If the number of shards in the index is a prime number it can only be shrunk into a single primary shard. Before shrinking, a (primary or replica) copy of every shard in the index must be present on the same node.
Shrinking works as follows:
- First, it creates a new target index with the same definition as the source index, but with a smaller number of primary shards.
- Then it hard-links segments from the source index into the target index. (If the file system doesn’t support hard-linking, all segments are copied into the new index instead, which is a much more time-consuming process. Likewise, when multiple data paths are used, shards on different data paths require a full copy of their segment files if they are not on the same disk, since hard links don’t work across disks.)
- Finally, it recovers the target index as though it were a closed index which had just been re-opened.
Preparing an index for shrinking
In order to shrink an index, the index must be marked as read-only, and a (primary or replica) copy of every shard in the index must be relocated to the same node and have health green.
These two conditions can be achieved with the following request:
PUT /my_source_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}
The index.routing.allocation.require._name setting forces the relocation of a copy of each shard to the node with name shrink_node_name. The index.blocks.write setting prevents write operations to this index while still allowing metadata changes like deleting the index.
It can take a while to relocate the source index. Progress can be tracked with the _cat recovery API, or the cluster health API can be used to wait until all shards have relocated with the wait_for_no_relocating_shards parameter.
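For example, a minimal sketch of such a wait request against the example index above could look like the following; the 2m timeout is only an illustrative value:
GET _cluster/health/my_source_index?wait_for_no_relocating_shards=true&timeout=2m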
Shrinking an index
To shrink my_source_index into a new index called my_target_index, issue the following request:
POST my_source_index/_shrink/my_target_index?copy_settings=true
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}
Setting index.routing.allocation.require._name to null clears the allocation requirement copied from the source index, and setting index.blocks.write to null clears the index write block copied from the source index.
The above request returns immediately once the target index has been added to the cluster state — it doesn’t wait for the shrink operation to start.
Indices can only be shrunk if they satisfy the following requirements:
- The target index must not exist.
- The source index must have more primary shards than the target index.
- The number of primary shards in the target index must be a factor of the number of primary shards in the source index.
- The index must not contain more than 2,147,483,519 documents in total across all shards that will be shrunk into a single shard on the target index, as this is the maximum number of docs that can fit into a single shard.
- The node handling the shrink process must have sufficient free disk space to accommodate a second copy of the existing index.
The _shrink API is similar to the create index API and accepts settings and aliases parameters for the target index:
POST my_source_index/_shrink/my_target_index?copy_settings=true
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}
The index.number_of_shards value is the number of shards in the target index and must be a factor of the number of shards in the source index. Best compression will only take effect when new writes are made to the index, such as when force-merging the shard to a single segment.
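As a rough illustration of that last point, the force merge API could be used to merge the shrunken index down to a single segment; the max_num_segments value simply reflects the single-segment case mentioned above:
POST my_target_index/_forcemerge?max_num_segments=1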
Mappings may not be specified in the _shrink request.
By default, with the exception of index.analysis, index.similarity, and index.sort settings, index settings on the source index are not copied during a shrink operation. With the exception of non-copyable settings, settings from the source index can be copied to the target index by adding the URL parameter copy_settings=true to the request. Note that copy_settings cannot be set to false. The parameter copy_settings will be removed in 8.0.0.
[6.4.0] Deprecated in 6.4.0. Not copying settings is deprecated; copying settings will be the default behavior in 7.x.
Monitoring the shrink process
The shrink process can be monitored with the _cat recovery API, or the cluster health API can be used to wait until all primary shards have been allocated by setting the wait_for_status parameter to yellow.
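As a minimal sketch of that wait, again with an arbitrary 2m timeout chosen for the example:
GET _cluster/health/my_target_index?wait_for_status=yellow&timeout=2m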
The _shrink API returns as soon as the target index has been added to the cluster state, before any shards have been allocated. At this point, all shards are in the state unassigned. If, for any reason, the target index can’t be allocated on the shrink node, its primary shard will remain unassigned until it can be allocated on that node.
Once the primary shard is allocated, it moves to state initializing, and the shrink process begins. When the shrink operation completes, the shard will become active. At that point, Elasticsearch will try to allocate any replicas and may decide to relocate the primary shard to another node.
Wait For Active Shards
Because the shrink operation creates a new index to shrink the shards to, the wait for active shards setting on index creation applies to the shrink index action as well.
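For example, a minimal sketch that waits for both the primary and one replica of the shrunken shard to be active before the request returns could pass wait_for_active_shards on the shrink request; the value 2 is only an illustrative choice:
POST my_source_index/_shrink/my_target_index?copy_settings=true&wait_for_active_shards=2
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1
  }
}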