Elasticsearch Guide: other versions:
Getting Started
- Basic Concepts
- Installation
- Exploring Your Cluster
- Modifying Your Data
- Exploring Your Data
- Conclusion
Setup Elasticsearch
- Installing Elasticsearch
- Configuring Elasticsearch
- Important Elasticsearch configuration
- Secure Settings
- Bootstrap Checks
- Important System Configuration
- Upgrading Elasticsearch
- Stopping Elasticsearch
Breaking changes
- Breaking changes in 5.3
- Breaking changes in 5.2
  - Shadow Replicas are deprecated
- Breaking changes in 5.1
- Breaking changes in 5.0
API Conventions
- Multiple Indices
- Date math support in index names
- Common options
- URL-based access control
Document APIs
- Reading and Writing documents
- Index API
- Get API
- Delete API
- Delete By Query API
- Update API
- Update By Query API
- Multi Get API
- Bulk API
- Reindex API
- Term Vectors
- Multi termvectors API
- ?refresh
Search APIs
- Search
- URI Search
- Request Body Search
  - Query
  - From / Size
  - Sort
  - Source filtering
  - Fields
  - Script Fields
  - Doc value Fields
  - Post filter
  - Highlighting
  - Rescoring
  - Search Type
  - Scroll
  - Preference
  - Explain
  - Version
  - Index Boost
  - min_score
  - Named Queries
  - Inner hits
  - Field Collapsing
  - Search After
- Search Template
- Multi Search Template
- Search Shards API
- Suggesters
- Multi Search API
- Count API
- Validate API
- Explain API
- Profile API
- Percolator
- Field stats API
Aggregations
- Metrics Aggregations
- Bucket Aggregations
- Pipeline Aggregations
- Matrix Aggregations
  - Matrix Stats
- Caching heavy aggregations
- Returning only aggregation results
- Aggregation Metadata
Indices APIs
- Create Index
- Delete Index
- Get Index
- Indices Exists
- Open / Close Index API
- Shrink Index
- Rollover Index
- Put Mapping
- Get Mapping
- Get Field Mapping
- Types Exists
- Index Aliases
- Update Indices Settings
- Get Settings
- Analyze
  - Explain Analyze
- Index Templates
- Shadow replica indices
  - Node level settings related to shadow replicas
- Indices Stats
- Indices Segments
- Indices Recovery
- Indices Shard Stores
- Clear Cache
- Flush
  - Synced Flush
- Refresh
- Force Merge
cat APIs
- cat aliases
- cat allocation
- cat count
- cat fielddata
- cat health
- cat indices
- cat master
- cat nodeattrs
- cat nodes
- cat pending tasks
- cat plugins
- cat recovery
- cat repositories
- cat thread pool
- cat shards
- cat segments
- cat snapshots
- cat templates
Cluster APIs
- Cluster Health
- Cluster State
- Cluster Stats
- Pending cluster tasks
- Cluster Reroute
- Cluster Update Settings
- Nodes Stats
- Nodes Info
- Task Management API
- Nodes hot_threads
- Cluster Allocation Explain API
Query DSL
- Query and filter context
- Match All Query
- Full text queries
- Term level queries
- Compound queries
- Joining queries
- Geo queries
- Specialized queries
- Span queries
- Minimum Should Match
- Multi Term Query Rewrite
Mapping
- Field datatypes
- Meta-Fields
- Mapping parameters
- Dynamic Mapping
Analysis
- Anatomy of an analyzer
- Testing analyzers
- Analyzers
- Normalizers
- Tokenizers
- Token Filters
- Character Filters
Modules
- Cluster
- Discovery
- Local Gateway
- HTTP
- Indices
- Network Settings
- Node
- Plugins
- Scripting
- Snapshot And Restore
- Thread Pool
- Transport
- Tribe node
- Cross Cluster Search
Index Modules
- Analysis
- Index Shard Allocation
- Mapper
- Merge
- Similarity module
- Slow Log
- Store
  - Pre-loading data into the file system cache
- Translog
Ingest Node
- Pipeline Definition
- Ingest APIs
- Accessing Data in Pipelines
- Handling Failures in Pipelines
- Processors
How To
- General recommendations
- Recipes
- Tune for indexing speed
- Tune for search speed
- Tune for disk usage
Testing
- Java Testing Framework
Glossary of terms
Release Notes
- 5.3.3 Release Notes
- 5.3.2 Release Notes
- 5.3.1 Release Notes
- 5.3.0 Release Notes
- 5.2.2 Release Notes
- 5.2.1 Release Notes
- 5.2.0 Release Notes
- 5.1.2 Release Notes
- 5.1.1 Release Notes
- 5.1.0 Release Notes
- 5.0.2 Release Notes
- 5.0.1 Release Notes
- 5.0.0 Combined Release Notes
- 5.0.0 GA Release Notes
- 5.0.0-rc1 Release Notes
- 5.0.0-beta1 Release Notes
- 5.0.0-alpha5 Release Notes
- 5.0.0-alpha4 Release Notes
- 5.0.0-alpha3 Release Notes
- 5.0.0-alpha2 Release Notes
- 5.0.0-alpha1 Release Notes
- 5.0.0-alpha1 Release Notes (Changes previously released in 2.x)
Painless API Reference

IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

« Index Templates Node level settings related to shadow replicas »

› ›

Shadow replica indices

edit

Shadow replica indices

edit

Deprecated in 5.2.0.

Shadow replicas don’t see much usage and we are planning to remove them

If you would like to use a shared filesystem, you can use the shadow replicas settings to choose where on disk the data for an index should be kept, as well as how Elasticsearch should replay operations on all the replica shards of an index.

In order to fully utilize the index.data_path and index.shadow_replicas settings, you need to allow Elasticsearch to use the same data directory for multiple instances by setting node.add_lock_id_to_custom_path to false in elasticsearch.yml:

node.add_lock_id_to_custom_path: false

You will also need to indicate to the security manager where the custom indices will be, so that the correct permissions can be applied. You can do this by setting the path.shared_data setting in elasticsearch.yml:

path.shared_data: /opt/data

This means that Elasticsearch can read and write to files in any subdirectory of the path.shared_data setting.

You can then create an index with a custom data path, where each node will use this path for the data:

Because shadow replicas do not index the document on replica shards, it’s possible for the replica’s known mapping to be behind the index’s known mapping if the latest cluster state has not yet been processed on the node containing the replica. Because of this, it is highly recommended to use pre-defined mappings when using shadow replicas.

curl -XPUT 'localhost:9200/my_index' -H 'Content-Type: application/json' -d '
{
    "index" : {
        "number_of_shards" : 1,
        "number_of_replicas" : 4,
        "data_path": "/opt/data/my_index",
        "shadow_replicas": true
    }
}'

In the above example, the "/opt/data/my_index" path is a shared filesystem that must be available on every node in the Elasticsearch cluster. You must also ensure that the Elasticsearch process has the correct permissions to read from and write to the directory used in the index.data_path setting.

The data_path does not have to contain the index name, in this case, "my_index" was used but it could easily also have been "/opt/data/"

An index that has been created with the index.shadow_replicas setting set to "true" will not replicate document operations to any of the replica shards, instead, it will only continually refresh. Once segments are available on the filesystem where the shadow replica resides (after an Elasticsearch "flush"), a regular refresh (governed by the index.refresh_interval) can be used to make the new data searchable.

Since documents are only indexed on the primary shard, realtime GET requests could fail to return a document if executed on the replica shard, therefore, GET API requests automatically have the ?preference=_primary flag set if there is no preference flag already set.

In order to ensure the data is being synchronized in a fast enough manner, you may need to tune the flush threshold for the index to a desired number. A flush is needed to fsync segment files to disk, so they will be visible to all other replica nodes. Users should test what flush threshold levels they are comfortable with, as increased flushing can impact indexing performance.

The Elasticsearch cluster will still detect the loss of a primary shard, and transform the replica into a primary in this situation. This transformation will take slightly longer, since no IndexWriter is maintained for each shadow replica.

Below is the list of settings that can be changed using the update settings API:

index.data_path (string): Path to use for the index’s data. Note that by default Elasticsearch will append the node ordinal by default to the path to ensure multiple instances of Elasticsearch on the same machine do not share a data directory.
index.shadow_replicas: Boolean value indicating this index should use shadow replicas. Defaults to false.
index.shared_filesystem: Boolean value indicating this index uses a shared filesystem. Defaults to the true if index.shadow_replicas is set to true, false otherwise.
index.shared_filesystem.recover_on_any_node: Boolean value indicating whether the primary shards for the index should be allowed to recover on any node in the cluster. If a node holding a copy of the shard is found, recovery prefers that node. Defaults to false.

« Index Templates Node level settings related to shadow replicas »

Was this helpful?

Feedback

The Search AI Company

ELK Stack

Elastic Cloud

Generative AI

Search

Security

Observability

By solution

Industries

Customer spotlight

Research

Build

Learn

Connect

Shadow replica indices

Shadow replica indices

Follow us

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards