- Elasticsearch Guide: other versions:
- Getting Started
- Set up Elasticsearch
- Installing Elasticsearch
- Configuring Elasticsearch
- Important Elasticsearch configuration
- Important System Configuration
- Bootstrap Checks
- Heap size check
- File descriptor check
- Memory lock check
- Maximum number of threads check
- Max file size check
- Maximum size virtual memory check
- Maximum map count check
- Client JVM check
- Use serial collector check
- System call filter check
- OnError and OnOutOfMemoryError checks
- Early-access check
- G1GC check
- All permission check
- Starting Elasticsearch
- Stopping Elasticsearch
- Adding nodes to your cluster
- Installing X-Pack
- Set up X-Pack
- Configuring X-Pack Java Clients
- X-Pack Settings
- Bootstrap Checks for X-Pack
- Upgrade Elasticsearch
- API Conventions
- Document APIs
- Search APIs
- Aggregations
- Metrics Aggregations
- Avg Aggregation
- Weighted Avg Aggregation
- Cardinality Aggregation
- Extended Stats Aggregation
- Geo Bounds Aggregation
- Geo Centroid Aggregation
- Max Aggregation
- Min Aggregation
- Percentiles Aggregation
- Percentile Ranks Aggregation
- Scripted Metric Aggregation
- Stats Aggregation
- Sum Aggregation
- Top Hits Aggregation
- Value Count Aggregation
- Bucket Aggregations
- Adjacency Matrix Aggregation
- Children Aggregation
- Composite Aggregation
- Date Histogram Aggregation
- Date Range Aggregation
- Diversified Sampler Aggregation
- Filter Aggregation
- Filters Aggregation
- Geo Distance Aggregation
- GeoHash grid Aggregation
- Global Aggregation
- Histogram Aggregation
- IP Range Aggregation
- Missing Aggregation
- Nested Aggregation
- Range Aggregation
- Reverse nested Aggregation
- Sampler Aggregation
- Significant Terms Aggregation
- Significant Text Aggregation
- Terms Aggregation
- Pipeline Aggregations
- Avg Bucket Aggregation
- Derivative Aggregation
- Max Bucket Aggregation
- Min Bucket Aggregation
- Sum Bucket Aggregation
- Stats Bucket Aggregation
- Extended Stats Bucket Aggregation
- Percentiles Bucket Aggregation
- Moving Average Aggregation
- Moving Function Aggregation
- Cumulative Sum Aggregation
- Bucket Script Aggregation
- Bucket Selector Aggregation
- Bucket Sort Aggregation
- Serial Differencing Aggregation
- Matrix Aggregations
- Caching heavy aggregations
- Returning only aggregation results
- Aggregation Metadata
- Returning the type of the aggregation
- Metrics Aggregations
- Indices APIs
- Create Index
- Delete Index
- Get Index
- Indices Exists
- Open / Close Index API
- Shrink Index
- Split Index
- Rollover Index
- Put Mapping
- Get Mapping
- Get Field Mapping
- Types Exists
- Index Aliases
- Update Indices Settings
- Get Settings
- Analyze
- Index Templates
- Indices Stats
- Indices Segments
- Indices Recovery
- Indices Shard Stores
- Clear Cache
- Flush
- Refresh
- Force Merge
- cat APIs
- Cluster APIs
- Query DSL
- Mapping
- Analysis
- Anatomy of an analyzer
- Testing analyzers
- Analyzers
- Normalizers
- Tokenizers
- Standard Tokenizer
- Letter Tokenizer
- Lowercase Tokenizer
- Whitespace Tokenizer
- UAX URL Email Tokenizer
- Classic Tokenizer
- Thai Tokenizer
- NGram Tokenizer
- Edge NGram Tokenizer
- Keyword Tokenizer
- Pattern Tokenizer
- Char Group Tokenizer
- Simple Pattern Tokenizer
- Simple Pattern Split Tokenizer
- Path Hierarchy Tokenizer
- Path Hierarchy Tokenizer Examples
- Token Filters
- Standard Token Filter
- ASCII Folding Token Filter
- Flatten Graph Token Filter
- Length Token Filter
- Lowercase Token Filter
- Uppercase Token Filter
- NGram Token Filter
- Edge NGram Token Filter
- Porter Stem Token Filter
- Shingle Token Filter
- Stop Token Filter
- Word Delimiter Token Filter
- Word Delimiter Graph Token Filter
- Multiplexer Token Filter
- Stemmer Token Filter
- Stemmer Override Token Filter
- Keyword Marker Token Filter
- Keyword Repeat Token Filter
- KStem Token Filter
- Snowball Token Filter
- Phonetic Token Filter
- Synonym Token Filter
- Synonym Graph Token Filter
- Compound Word Token Filters
- Reverse Token Filter
- Elision Token Filter
- Truncate Token Filter
- Unique Token Filter
- Pattern Capture Token Filter
- Pattern Replace Token Filter
- Trim Token Filter
- Limit Token Count Token Filter
- Hunspell Token Filter
- Common Grams Token Filter
- Normalization Token Filter
- CJK Width Token Filter
- CJK Bigram Token Filter
- Delimited Payload Token Filter
- Keep Words Token Filter
- Keep Types Token Filter
- Exclude mode settings example
- Classic Token Filter
- Apostrophe Token Filter
- Decimal Digit Token Filter
- Fingerprint Token Filter
- Minhash Token Filter
- Remove Duplicates Token Filter
- Character Filters
- Modules
- Index Modules
- Ingest Node
- Pipeline Definition
- Ingest APIs
- Accessing Data in Pipelines
- Handling Failures in Pipelines
- Processors
- Append Processor
- Bytes Processor
- Convert Processor
- Date Processor
- Date Index Name Processor
- Fail Processor
- Foreach Processor
- Grok Processor
- Gsub Processor
- Join Processor
- JSON Processor
- KV Processor
- Lowercase Processor
- Remove Processor
- Rename Processor
- Script Processor
- Set Processor
- Split Processor
- Sort Processor
- Trim Processor
- Uppercase Processor
- Dot Expander Processor
- URL Decode Processor
- SQL Access
- Monitor a cluster
- Rolling up historical data
- Secure a cluster
- Overview
- Configuring Security
- Encrypting communications in Elasticsearch
- Encrypting Communications in an Elasticsearch Docker Container
- Enabling cipher suites for stronger encryption
- Separating node-to-node and client traffic
- Configuring an Active Directory realm
- Configuring a file realm
- Configuring an LDAP realm
- Configuring a native realm
- Configuring a PKI realm
- Configuring a SAML realm
- Configuring a Kerberos realm
- FIPS 140-2
- Security settings
- Auditing settings
- Getting started with security
- How security works
- User authentication
- Built-in users
- Internal users
- Realms
- Active Directory user authentication
- File-based user authentication
- LDAP user authentication
- Native user authentication
- PKI user authentication
- SAML authentication
- Kerberos authentication
- Integrating with other authentication systems
- Enabling anonymous access
- Controlling the user cache
- Configuring SAML single-sign-on on the Elastic Stack
- User authorization
- Auditing security events
- Encrypting communications
- Restricting connections with IP filtering
- Cross cluster search, tribe, clients, and integrations
- Reference
- Troubleshooting
- Can’t log in after upgrading to 6.4.3
- Some settings are not returned via the nodes settings API
- Authorization exceptions
- Users command fails due to extra arguments
- Users are frequently locked out of Active Directory
- Certificate verification fails for curl on Mac
- SSLHandshakeException causes connections to fail
- Common SSL/TLS exceptions
- Common Kerberos exceptions
- Common SAML issues
- Internal Server Error in Kibana
- Setup-passwords command fails due to connection failure
- Failures due to relocation of the configuration files
- Limitations
- Alerting on Cluster and Index Events
- X-Pack APIs
- Info API
- Explore API
- Licensing APIs
- Migration APIs
- Machine Learning APIs
- Add Events to Calendar
- Add Jobs to Calendar
- Close Jobs
- Create Calendar
- Create Datafeeds
- Create Filter
- Create Jobs
- Delete Calendar
- Delete Datafeeds
- Delete Events from Calendar
- Delete Filter
- Delete Jobs
- Delete Jobs from Calendar
- Delete Model Snapshots
- Flush Jobs
- Forecast Jobs
- Get Calendars
- Get Buckets
- Get Overall Buckets
- Get Categories
- Get Datafeeds
- Get Datafeed Statistics
- Get Influencers
- Get Jobs
- Get Job Statistics
- Get Model Snapshots
- Get Scheduled Events
- Get Filters
- Get Records
- Open Jobs
- Post Data to Jobs
- Preview Datafeeds
- Revert Model Snapshots
- Start Datafeeds
- Stop Datafeeds
- Update Datafeeds
- Update Filter
- Update Jobs
- Update Model Snapshots
- Rollup APIs
- Security APIs
- Create or update application privileges API
- Authenticate API
- Change passwords API
- Clear Cache API
- Create or update role mappings API
- Clear roles cache API
- Create or update roles API
- Create or update users API
- Delete application privileges API
- Delete role mappings API
- Delete roles API
- Delete users API
- Disable users API
- Enable users API
- Get application privileges API
- Get role mappings API
- Get roles API
- Get token API
- Get users API
- Has Privileges API
- Invalidate token API
- SSL Certificate API
- Watcher APIs
- Definitions
- Command line tools
- How To
- Testing
- Glossary of terms
- Release Highlights
- Breaking changes
- Release Notes
- Elasticsearch version 6.4.3
- Elasticsearch version 6.4.2
- Elasticsearch version 6.4.1
- Elasticsearch version 6.4.0
- Elasticsearch version 6.3.2
- Elasticsearch version 6.3.1
- Elasticsearch version 6.3.0
- Elasticsearch version 6.2.4
- Elasticsearch version 6.2.3
- Elasticsearch version 6.2.2
- Elasticsearch version 6.2.1
- Elasticsearch version 6.2.0
- Elasticsearch version 6.1.4
- Elasticsearch version 6.1.3
- Elasticsearch version 6.1.2
- Elasticsearch version 6.1.1
- Elasticsearch version 6.1.0
- Elasticsearch version 6.0.1
- Elasticsearch version 6.0.0
- Elasticsearch version 6.0.0-rc2
- Elasticsearch version 6.0.0-rc1
- Elasticsearch version 6.0.0-beta2
- Elasticsearch version 6.0.0-beta1
- Elasticsearch version 6.0.0-alpha2
- Elasticsearch version 6.0.0-alpha1
- Elasticsearch version 6.0.0-alpha1 (Changes previously released in 5.x)
Cross-cluster search
editCross-cluster search
editThe cross-cluster search feature allows any node to act as a federated client across multiple clusters. In contrast to the tribe node feature, a cross-cluster search node won’t join the remote cluster, instead it connects to a remote cluster in a light fashion in order to execute federated search requests.
Cross-cluster search works by configuring a remote cluster in the cluster state and connecting only to a limited number of nodes in the remote cluster. Each remote cluster is referenced by a name and a list of seed nodes. When a remote cluster is registered, its cluster state is retrieved from one of the seed nodes so that up to 3 gateway nodes are selected to be connected to as part of upcoming cross-cluster search requests. Cross-cluster search requests consist of uni-directional connections from the coordinating node to the previously selected remote nodes only. It is possible to tag which nodes should be selected through node attributes (see Cross-cluster search settings).
Each node in a cluster that has remote clusters configured connects to one or more gateway nodes and uses them to federate search requests to the remote cluster.
Configuring cross-cluster search
editRemote clusters can be specified globally using cluster settings
(which can be updated dynamically), or local to individual nodes using the
elasticsearch.yml
file.
If a remote cluster is configured via elasticsearch.yml
only the nodes with
that configuration will be able to connect to the remote cluster. In other
words, federated search requests will have to be sent specifically to those
nodes. Remote clusters set via the cluster settings API
will be available on every node in the cluster.
This feature was added as Beta in Elasticsearch v5.3
with further improvements made in 5.4 and 5.5. It requires gateway eligible nodes to be on v5.5
onwards.
The elasticsearch.yml
config file for a cross-cluster search node just needs to list the
remote clusters that should be connected to, for instance:
|
The equivalent example using the cluster settings API to add remote clusters to all nodes in the cluster would look like the following:
PUT _cluster/settings { "persistent": { "search": { "remote": { "cluster_one": { "seeds": [ "127.0.0.1:9300" ] }, "cluster_two": { "seeds": [ "127.0.0.1:9301" ] }, "cluster_three": { "seeds": [ "127.0.0.1:9302" ] } } } } }
A remote cluster can be deleted from the cluster settings by setting its seeds to null
:
PUT _cluster/settings { "persistent": { "search": { "remote": { "cluster_three": { "seeds": null } } } } }
|
Using cross-cluster search
editTo search the twitter
index on remote cluster cluster_one
the index name
must be prefixed with the cluster alias separated by a :
character:
GET /cluster_one:twitter/_search { "query": { "match": { "user": "kimchy" } } }
{ "took": 150, "timed_out": false, "_shards": { "total": 1, "successful": 1, "failed": 0, "skipped": 0 }, "_clusters": { "total": 1, "successful": 1, "skipped": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "cluster_one:twitter", "_type": "_doc", "_id": "0", "_score": 1, "_source": { "user": "kimchy", "date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "likes": 0 } } ] } }
In contrast to the tribe
feature cross-cluster search can also search indices with the same name on different
clusters:
GET /cluster_one:twitter,twitter/_search { "query": { "match": { "user": "kimchy" } } }
Search results are disambiguated the same way as the indices are disambiguated in the request. Even if index names are identical these indices will be treated as different indices when results are merged. All results retrieved from a remote index will be prefixed with their remote cluster name:
{ "took": 150, "timed_out": false, "_shards": { "total": 2, "successful": 2, "failed": 0, "skipped": 0 }, "_clusters": { "total": 2, "successful": 2, "skipped": 0 }, "hits": { "total": 2, "max_score": 1, "hits": [ { "_index": "cluster_one:twitter", "_type": "_doc", "_id": "0", "_score": 1, "_source": { "user": "kimchy", "date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "likes": 0 } }, { "_index": "twitter", "_type": "_doc", "_id": "0", "_score": 2, "_source": { "user": "kimchy", "date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "likes": 0 } } ] } }
Skipping disconnected clusters
editBy default all remote clusters that are searched via cross-cluster search need to be available when
the search request is executed, otherwise the whole request fails and no search results are returned
despite some of the clusters are available. Remote clusters can be made optional through the
boolean skip_unavailable
setting, set to false
by default.
GET /cluster_one:twitter,cluster_two:twitter,twitter/_search { "query": { "match": { "user": "kimchy" } } }
{ "took": 150, "timed_out": false, "_shards": { "total": 2, "successful": 2, "failed": 0, "skipped": 0 }, "_clusters": { "total": 3, "successful": 2, "skipped": 1 }, "hits": { "total": 2, "max_score": 1, "hits": [ { "_index": "cluster_one:twitter", "_type": "_doc", "_id": "0", "_score": 1, "_source": { "user": "kimchy", "date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "likes": 0 } }, { "_index": "twitter", "_type": "_doc", "_id": "0", "_score": 2, "_source": { "user": "kimchy", "date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "likes": 0 } } ] } }
Cross-cluster search settings
edit-
search.remote.connections_per_cluster
-
The number of nodes to connect to per remote cluster. The default is
3
. -
search.remote.initial_connect_timeout
-
The time to wait for remote connections to be established when the node starts. The default is
30s
. -
search.remote.node.attr
-
A node attribute to filter out nodes that are eligible as a gateway node in
the remote cluster. For instance a node can have a node attribute
node.attr.gateway: true
such that only nodes with this attribute will be connected to ifsearch.remote.node.attr
is set togateway
. -
search.remote.connect
-
By default, any node in the cluster can act as a cross-cluster client and
connect to remote clusters. The
search.remote.connect
setting can be set tofalse
(defaults totrue
) to prevent certain nodes from connecting to remote clusters. Cross-cluster search requests must be sent to a node that is allowed to act as a cross-cluster client. -
search.remote.${cluster_alias}.skip_unavailable
-
Per cluster boolean setting that allows to skip specific clusters when no
nodes belonging to them are available and they are searched as part of a
cross-cluster search request. Default is
false
, meaning that all clusters are mandatory by default, but they can selectively be made optional by setting this setting totrue
.
Retrieving remote clusters info
editThe Remote Cluster Info API allows to retrieve information about the configured remote clusters, as well as the remote nodes that the cross-cluster search node is connected to.
On this page