7.3.0 release highlights
Voting-only master nodes
A new node.voting-only role has been introduced that allows nodes to participate in elections even though they are not eligible to become the master. The benefit is that these nodes still help with high availability while requiring less CPU and heap than master nodes.
The node.voting-only role is only available with the default distribution of Elasticsearch.
Reloading of search-time synonyms
A new Analyzer reload API lets you reload the definitions of search-time analyzers and their associated resources. A common use case for this API is reloading search-time synonyms. In earlier versions of Elasticsearch, users could force synonyms to be reloaded by closing the index and then opening it again. With this new API, synonyms can be updated without closing the index.
The Analyzer reload API is only available with the default distribution of Elasticsearch.
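As a rough sketch of how such a reload might be requested, the Python snippet below builds (but does not send) an HTTP request for the reload endpoint. The host http://localhost:9200, the index name my_index, and the helper name build_reload_request are assumptions for illustration; the _reload_search_analyzers path comes from the Elasticsearch REST API reference rather than from this page.

```python
import urllib.request


def build_reload_request(index, host="http://localhost:9200"):
    """Build a POST request asking Elasticsearch to reload the search analyzers of `index`.

    `host` and `index` are placeholders; adjust them for a real cluster.
    """
    url = f"{host}/{index}/_reload_search_analyzers"
    return urllib.request.Request(url, method="POST")


req = build_reload_request("my_index")
print(req.get_method(), req.full_url)
# To send it to a running cluster: urllib.request.urlopen(req)
```

Keeping the request construction separate from sending it makes the sketch usable without a live cluster.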
New flattened field type
A new flattened field type has been added, which can index arbitrary JSON objects into a single field. This helps avoid hitting issues due to many fields in mappings, at the cost of more limited search functionality.
The flattened field type is only available with the default distribution of Elasticsearch.
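A minimal sketch of what an index-creation body using this field type could look like, written as a Python dictionary. The field name labels and the helper flattened_mapping are hypothetical; only the type value flattened comes from the text above.

```python
import json


def flattened_mapping(field_name):
    """Return an index-creation body that maps `field_name` as a flattened field."""
    return {
        "mappings": {
            "properties": {
                # The whole JSON object stored under this field is indexed as one field.
                field_name: {"type": "flattened"}
            }
        }
    }


body = flattened_mapping("labels")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a create index request.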
Functions on vector fields
Painless now supports computing the cosine similarity and the dot product of a query vector and the values of either a sparse_vector or dense_vector field.
These functions are only available with the default distribution of Elasticsearch.
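To make the two measures concrete, here is a plain-Python sketch of the underlying arithmetic. It is not the Painless API; the function names dot_product and cosine_similarity are chosen for illustration only, and both vectors are assumed to be dense lists of the same length.

```python
import math


def dot_product(query_vector, doc_vector):
    """Sum of element-wise products of two equal-length vectors."""
    if len(query_vector) != len(doc_vector):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(q * d for q, d in zip(query_vector, doc_vector))


def cosine_similarity(query_vector, doc_vector):
    """Dot product divided by the product of the two vector magnitudes."""
    norm_q = math.sqrt(dot_product(query_vector, query_vector))
    norm_d = math.sqrt(dot_product(doc_vector, doc_vector))
    if norm_q == 0 or norm_d == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(query_vector, doc_vector) / (norm_q * norm_d)


print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))      # 0.0
```

Cosine similarity ignores vector length and compares direction only, while the dot product also grows with magnitude.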
Prefix and wildcard support for intervals
Intervals now support querying by prefix or wildcard.
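The sketch below shows how such query bodies might be assembled in Python. The field name my_text, the sample prefix and pattern, and the helper names are hypothetical. The prefix and wildcard rule shapes follow the intervals query reference as best understood here, so check them against the documentation for your version.

```python
def intervals_prefix_query(field, prefix):
    """Intervals query matching terms in `field` that start with `prefix`."""
    return {"query": {"intervals": {field: {"prefix": {"prefix": prefix}}}}}


def intervals_wildcard_query(field, pattern):
    """Intervals query matching terms in `field` against a wildcard `pattern`."""
    return {"query": {"intervals": {field: {"wildcard": {"pattern": pattern}}}}}


prefix_body = intervals_prefix_query("my_text", "out")
wildcard_body = intervals_wildcard_query("my_text", "out*er")
print(prefix_body)
print(wildcard_body)
```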
Rare terms aggregation
A new Rare Terms aggregation lets you find the least frequent values in a field. It is intended to replace the "order" : { "_count" : "asc" } option of the terms aggregation.
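For illustration, a search body using this aggregation might be built as follows. The aggregation name rare_genres, the field genre, and the helper rare_terms_body are hypothetical. The max_doc_count parameter is taken from the rare terms aggregation reference, not from this page.

```python
def rare_terms_body(agg_name, field, max_doc_count=1):
    """Search body that returns only terms appearing in at most `max_doc_count` documents."""
    return {
        "size": 0,  # skip the hits and return only the aggregation results
        "aggs": {
            agg_name: {
                "rare_terms": {"field": field, "max_doc_count": max_doc_count}
            }
        },
    }


body = rare_terms_body("rare_genres", "genre")
print(body)
```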
Aliases are replicated via cross-cluster replication
Read aliases are now replicated via cross-cluster replication. Note that write aliases are still not replicated, since they only make sense for indices that are being written to, while follower indices do not receive direct writes.
SQL supports frozen indices
Elasticsearch SQL now supports querying frozen indices via the new FROZEN keyword.
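A small sketch of how a request body with this keyword might be put together for the SQL REST API. The index name archive, the limit, and the helper frozen_sql_body are assumptions for illustration. The helper does no quoting, so it only suits simple index names.

```python
import json


def frozen_sql_body(index, limit=10):
    """Build a SQL REST API body that explicitly includes a frozen index via FROZEN."""
    return {"query": f"SELECT * FROM FROZEN {index} LIMIT {limit}"}


body = frozen_sql_body("archive", limit=1)
print(json.dumps(body))
```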
Fixed memory leak when using templates in document-level security
Document-level security was using an unbounded cache for the set of visible documents. This could lead to a memory leak when using a templated query as a role query. The cache now evicts entries based on memory usage and is limited to 50MB.
More memory-efficient aggregations on keyword fields
Terms aggregations generally need to build global ordinals in order to run. Unfortunately, this operation became more memory-intensive in 6.0 due to the move to doc-value iterators, which was made to improve the handling of sparse fields. Memory pressure from global ordinals is now back to a level similar to that of pre-6.0 releases.
Data frames: transform and pivot your streaming data
[beta] This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Transforms are a core new feature in Elasticsearch that enable you to transform an existing index to a secondary, summarized index. Transforms enable you to pivot your data and create entity-centric indices that can summarize the behavior of an entity. This organizes the data into an analysis-friendly format.
Transforms were first available in 7.2. With 7.3, they can run either as a single batch transform or continuously, incorporating new data as it is ingested.
Data frames enable new possibilities for machine learning analysis (such as outlier detection), but they can also be useful for other types of visualizations and custom types of analysis.
Discover your most unusual data using outlier detection
The goal of outlier detection is to find the most unusual data points in an index. We analyse the numerical fields of each data point (document in an index) and annotate them with how unusual they are.
We use unsupervised outlier detection, which means there is no need to provide a training data set to teach outlier detection to recognize outliers. In practice, this is achieved by using an ensemble of distance-based and density-based techniques to identify those data points which are the most different from the bulk of the data in the index. We assign to each analysed data point an outlier score, which captures how different the entity is from other entities in the index.
In addition to new outlier detection functionality, we are introducing the evaluate data frame analytics API, which enables you to compute a range of performance metrics such as confusion matrices, precision, recall, the receiver-operating characteristics (ROC) curve and the area under the ROC curve. If you are running outlier detection on a source index that has already been labeled to indicate which points are truly outliers and which are normal, you can use the evaluate data frame analytics API to assess the performance of the outlier detection analytics on your dataset.
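To illustrate the kind of metrics mentioned above, the following plain-Python sketch computes confusion-matrix counts, precision, and recall from labeled data. It does not call the evaluate data frame analytics API; the function names and the sample labels are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = fp = fn = tn = 0
    for is_outlier, flagged in zip(actual, predicted):
        if is_outlier and flagged:
            tp += 1
        elif not is_outlier and flagged:
            fp += 1
        elif is_outlier and not flagged:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(actual, predicted):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn, _ = confusion_counts(actual, predicted)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: True marks a real outlier (actual) or a flagged point (predicted).
actual = [True, False, True, False, True]
predicted = [True, True, False, False, True]
print(confusion_counts(actual, predicted))  # (2, 1, 1, 1)
print(precision_recall(actual, predicted))
```

In this sample, two of the three flagged points are real outliers and two of the three real outliers are flagged, so precision and recall are both 2/3.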