- Elasticsearch Guide: other versions:
- Getting Started
- Set up Elasticsearch
- Installing Elasticsearch
- Configuring Elasticsearch
- Important Elasticsearch configuration
- Important System Configuration
- Bootstrap Checks
- Heap size check
- File descriptor check
- Memory lock check
- Maximum number of threads check
- Maximum size virtual memory check
- Max file size check
- Maximum map count check
- Client JVM check
- Use serial collector check
- System call filter check
- OnError and OnOutOfMemoryError checks
- Early-access check
- G1GC check
- Stopping Elasticsearch
- Upgrade Elasticsearch
- Set up X-Pack
- Breaking changes
- Breaking changes in 6.0
- Aggregations changes
- Analysis changes
- Cat API changes
- Clients changes
- Cluster changes
- Document API changes
- Indices changes
- Ingest changes
- Java API changes
- Mapping changes
- Packaging changes
- Percolator changes
- Plugins changes
- Reindex changes
- REST changes
- Scripting changes
- Search and Query DSL changes
- Settings changes
- Stats and info changes
- Breaking changes in 6.1
- Breaking changes in 6.0
- X-Pack Breaking Changes
- API Conventions
- Document APIs
- Search APIs
- Aggregations
- Metrics Aggregations
- Avg Aggregation
- Cardinality Aggregation
- Extended Stats Aggregation
- Geo Bounds Aggregation
- Geo Centroid Aggregation
- Max Aggregation
- Min Aggregation
- Percentiles Aggregation
- Percentile Ranks Aggregation
- Scripted Metric Aggregation
- Stats Aggregation
- Sum Aggregation
- Top Hits Aggregation
- Value Count Aggregation
- Bucket Aggregations
- Adjacency Matrix Aggregation
- Children Aggregation
- Composite Aggregation
- Date Histogram Aggregation
- Date Range Aggregation
- Diversified Sampler Aggregation
- Filter Aggregation
- Filters Aggregation
- Geo Distance Aggregation
- GeoHash grid Aggregation
- Global Aggregation
- Histogram Aggregation
- IP Range Aggregation
- Missing Aggregation
- Nested Aggregation
- Range Aggregation
- Reverse nested Aggregation
- Sampler Aggregation
- Significant Terms Aggregation
- Significant Text Aggregation
- Terms Aggregation
- Pipeline Aggregations
- Avg Bucket Aggregation
- Derivative Aggregation
- Max Bucket Aggregation
- Min Bucket Aggregation
- Sum Bucket Aggregation
- Stats Bucket Aggregation
- Extended Stats Bucket Aggregation
- Percentiles Bucket Aggregation
- Moving Average Aggregation
- Cumulative Sum Aggregation
- Bucket Script Aggregation
- Bucket Selector Aggregation
- Bucket Sort Aggregation
- Serial Differencing Aggregation
- Matrix Aggregations
- Caching heavy aggregations
- Returning only aggregation results
- Aggregation Metadata
- Returning the type of the aggregation
- Metrics Aggregations
- Indices APIs
- Create Index
- Delete Index
- Get Index
- Indices Exists
- Open / Close Index API
- Shrink Index
- Split Index
- Rollover Index
- Put Mapping
- Get Mapping
- Get Field Mapping
- Types Exists
- Index Aliases
- Update Indices Settings
- Get Settings
- Analyze
- Index Templates
- Indices Stats
- Indices Segments
- Indices Recovery
- Indices Shard Stores
- Clear Cache
- Flush
- Refresh
- Force Merge
- cat APIs
- Cluster APIs
- Query DSL
- Mapping
- Analysis
- Anatomy of an analyzer
- Testing analyzers
- Analyzers
- Normalizers
- Tokenizers
- Token Filters
- Standard Token Filter
- ASCII Folding Token Filter
- Flatten Graph Token Filter
- Length Token Filter
- Lowercase Token Filter
- Uppercase Token Filter
- NGram Token Filter
- Edge NGram Token Filter
- Porter Stem Token Filter
- Shingle Token Filter
- Stop Token Filter
- Word Delimiter Token Filter
- Word Delimiter Graph Token Filter
- Stemmer Token Filter
- Stemmer Override Token Filter
- Keyword Marker Token Filter
- Keyword Repeat Token Filter
- KStem Token Filter
- Snowball Token Filter
- Phonetic Token Filter
- Synonym Token Filter
- Synonym Graph Token Filter
- Compound Word Token Filters
- Reverse Token Filter
- Elision Token Filter
- Truncate Token Filter
- Unique Token Filter
- Pattern Capture Token Filter
- Pattern Replace Token Filter
- Trim Token Filter
- Limit Token Count Token Filter
- Hunspell Token Filter
- Common Grams Token Filter
- Normalization Token Filter
- CJK Width Token Filter
- CJK Bigram Token Filter
- Delimited Payload Token Filter
- Keep Words Token Filter
- Keep Types Token Filter
- Classic Token Filter
- Apostrophe Token Filter
- Decimal Digit Token Filter
- Fingerprint Token Filter
- Minhash Token Filter
- Character Filters
- Modules
- Index Modules
- Ingest Node
- Pipeline Definition
- Ingest APIs
- Accessing Data in Pipelines
- Handling Failures in Pipelines
- Processors
- Append Processor
- Convert Processor
- Date Processor
- Date Index Name Processor
- Fail Processor
- Foreach Processor
- Grok Processor
- Gsub Processor
- Join Processor
- JSON Processor
- KV Processor
- Lowercase Processor
- Remove Processor
- Rename Processor
- Script Processor
- Set Processor
- Split Processor
- Sort Processor
- Trim Processor
- Uppercase Processor
- Dot Expander Processor
- URL Decode Processor
- Monitoring Elasticsearch
- X-Pack APIs
- Info API
- Explore API
- Machine Learning APIs
- Close Jobs
- Create Datafeeds
- Create Jobs
- Delete Datafeeds
- Delete Jobs
- Delete Model Snapshots
- Flush Jobs
- Forecast Jobs
- Get Buckets
- Get Overall Buckets
- Get Categories
- Get Datafeeds
- Get Datafeed Statistics
- Get Influencers
- Get Jobs
- Get Job Statistics
- Get Model Snapshots
- Get Records
- Open Jobs
- Post Data to Jobs
- Preview Datafeeds
- Revert Model Snapshots
- Start Datafeeds
- Stop Datafeeds
- Update Datafeeds
- Update Jobs
- Update Model Snapshots
- Security APIs
- Watcher APIs
- Migration APIs
- Deprecation Info APIs
- Definitions
- X-Pack Commands
- How To
- Testing
- Glossary of terms
- Release Notes
- 6.1.4 Release Notes
- 6.1.3 Release Notes
- 6.1.2 Release Notes
- 6.1.1 Release Notes
- 6.1.0 Release Notes
- 6.0.1 Release Notes
- 6.0.0 Release Notes
- 6.0.0-rc2 Release Notes
- 6.0.0-rc1 Release Notes
- 6.0.0-beta2 Release Notes
- 6.0.0-beta1 Release Notes
- 6.0.0-alpha2 Release Notes
- 6.0.0-alpha1 Release Notes
- 6.0.0-alpha1 Release Notes (Changes previously released in 5.x)
- X-Pack Release Notes
WARNING: Version 6.1 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Important Elasticsearch configuration
editImportant Elasticsearch configuration
editWhile Elasticsearch requires very little configuration, there are a number of settings which need to be configured manually and should definitely be configured before going into production.
path.data
and path.logs
editIf you are using the .zip
or .tar.gz
archives, the data
and logs
directories are sub-folders of $ES_HOME
. If these important folders are
left in their default locations, there is a high risk of them being deleted
while upgrading Elasticsearch to a new version.
In production use, you will almost certainly want to change the locations of the data and log folder:
path: logs: /var/log/elasticsearch data: /var/data/elasticsearch
The RPM and Debian distributions already use custom paths for data
and
logs
.
The path.data
settings can be set to multiple paths, in which case all paths
will be used to store data (although the files belonging to a single shard
will all be stored on the same data path):
path: data: - /mnt/elasticsearch_1 - /mnt/elasticsearch_2 - /mnt/elasticsearch_3
cluster.name
editA node can only join a cluster when it shares its cluster.name
with all the
other nodes in the cluster. The default name is elasticsearch
, but you
should change it to an appropriate name which describes the purpose of the
cluster.
cluster.name: logging-prod
Make sure that you don’t reuse the same cluster names in different environments, otherwise you might end up with nodes joining the wrong cluster.
node.name
editBy default, Elasticsearch will use the first seven characters of the randomly generated UUID as the node id. Note that the node id is persisted and does not change when a node restarts and therefore the default node name will also not change.
It is worth configuring a more meaningful name which will also have the advantage of persisting after restarting the node:
node.name: prod-data-2
The node.name
can also be set to the server’s HOSTNAME as follows:
node.name: ${HOSTNAME}
bootstrap.memory_lock
editIt is vitally important to the health of your node that none of the JVM is
ever swapped out to disk. One way of achieving that is set the
bootstrap.memory_lock
setting to true
.
For this setting to have effect, other system settings need to be configured
first. See Enable bootstrap.memory_lock
for more details about how to set up memory locking
correctly.
network.host
editBy default, Elasticsearch binds to loopback addresses only — e.g. 127.0.0.1
and [::1]
. This is sufficient to run a single development node on a server.
In fact, more than one node can be started from the same $ES_HOME
location
on a single node. This can be useful for testing Elasticsearch’s ability to
form clusters, but it is not a configuration recommended for production.
In order to communicate and to form a cluster with nodes on other servers,
your node will need to bind to a non-loopback address. While there are many
network settings, usually all you need to configure is
network.host
:
network.host: 192.168.1.10
The network.host
setting also understands some special values such as
_local_
, _site_
, _global_
and modifiers like :ip4
and :ip6
, details
of which can be found in Special values for network.host
.
As soon you provide a custom setting for network.host
,
Elasticsearch assumes that you are moving from development mode to production
mode, and upgrades a number of system startup checks from warnings to
exceptions. See Development mode vs production mode for more information.
discovery.zen.ping.unicast.hosts
editOut of the box, without any network configuration, Elasticsearch will bind to the available loopback addresses and will scan ports 9300 to 9305 to try to connect to other nodes running on the same server. This provides an auto- clustering experience without having to do any configuration.
When the moment comes to form a cluster with nodes on other servers, you have to provide a seed list of other nodes in the cluster that are likely to be live and contactable. This can be specified as follows:
The port will default to |
|
A hostname that resolves to multiple IP addresses will try all resolved addresses. |
discovery.zen.minimum_master_nodes
editTo prevent data loss, it is vital to configure the
discovery.zen.minimum_master_nodes
setting so that each master-eligible node
knows the minimum number of master-eligible nodes that must be visible in
order to form a cluster.
Without this setting, a cluster that suffers a network failure is at risk of
having the cluster split into two independent clusters — a split brain — which will lead to data loss. A more detailed explanation is provided
in Avoiding split brain with minimum_master_nodes
.
To avoid a split brain, this setting should be set to a quorum of master- eligible nodes:
(master_eligible_nodes / 2) + 1
In other words, if there are three master-eligible nodes, then minimum master
nodes should be set to (3 / 2) + 1
or 2
:
discovery.zen.minimum_master_nodes: 2
JVM heap dump path
editThe RPM and Debian package distributions default to configuring
the JVM to dump the heap on out of memory exceptions to
/var/lib/elasticsearch
. If this path is not suitable for storing heap dumps,
you should modify the entry -XX:HeapDumpPath=/var/lib/elasticsearch
in
jvm.options
to an alternate path. If you specify a filename
instead of a directory, the JVM will repeatedly use the same file; this is one
mechanism for preventing heap dumps from accumulating in the heap dump path.
Alternatively, you can configure a scheduled task via your OS to remove heap
dumps that are older than a configured age.
Note that the archive distributions do not configure the heap dump path by
default. Instead, the JVM will default to dumping to the working directory for
the Elasticsearch process. If you wish to configure a heap dump path, you should
modify the entry #-XX:HeapDumpPath=/heap/dump/path
in
jvm.options
to remove the comment marker #
and to specify an
actual path.
On this page