IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Lexical Structure

This section covers the main lexical structure of Elasticsearch SQL. For the most part it resembles that of ANSI SQL, which is why low-level details are not discussed in depth.

Elasticsearch SQL currently accepts only one command at a time. A command is a sequence of tokens terminated by the end of the input stream.

A token can be a key word, an identifier (quoted or unquoted), a literal (or constant), or a special character symbol (typically a delimiter). Tokens are typically separated by whitespace (a space or a tab). In some cases, where there is no ambiguity (typically due to a character symbol), the whitespace is not needed; for readability, however, omitting it should be avoided.

Key Words

Take the following example:

SELECT * FROM table

This query has four tokens: SELECT, *, FROM and table. The first three, namely SELECT, * and FROM, are key words, meaning words that have a fixed meaning in SQL. The token table is an identifier, meaning it identifies (by name) an entity inside SQL such as a table (in this case), a column, etc.

As one can see, key words and identifiers have the same lexical structure, so one cannot tell whether a token is one or the other without knowing the SQL language; the complete list of key words is available in the reserved keywords appendix. Note that key words are case-insensitive, meaning the previous example can be written as:

select * fRoM table;

Identifiers, however, are not: as Elasticsearch is case sensitive, Elasticsearch SQL uses the received value verbatim.

To help differentiate between the two, SQL key words are upper-cased throughout the documentation, a convention we find increases readability and thus recommend to others.
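
The key word and identifier rules above can be sketched in a few lines of Python. This is a simplified illustration, not the Elasticsearch SQL lexer: the KEYWORDS set is a hypothetical three-word subset of the real list, the function names tokenize and normalize are made up for this example, and numeric literals containing a decimal point are not handled.

```python
import re

# Hypothetical subset of key words, for illustration only; the complete
# list is in the reserved keywords appendix.
KEYWORDS = {"SELECT", "FROM", "WHERE"}

# A token is one of: a double-quoted identifier, a single-quoted string,
# a run of word characters, or any single non-whitespace symbol.
TOKEN_RE = re.compile(r'"(?:[^"]|"")*"|\'(?:[^\']|\'\')*\'|\w+|\S')


def tokenize(command: str) -> list[str]:
    """Split a command into tokens; whitespace between tokens is dropped."""
    return TOKEN_RE.findall(command)


def normalize(command: str) -> list[str]:
    """Upper-case key words and leave every other token verbatim."""
    return [t.upper() if t.upper() in KEYWORDS else t for t in tokenize(command)]


print(tokenize("SELECT * FROM table"))   # ['SELECT', '*', 'FROM', 'table']
print(normalize("select * fRoM Hosts"))  # ['SELECT', '*', 'FROM', 'Hosts']
```

Only the key words change case; the identifier Hosts is passed through exactly as received.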

Identifiers

Identifiers can be of two types, quoted and unquoted:

SELECT ip_address FROM "hosts-*"

This query has two identifiers, ip_address and hosts-* (an index pattern). As ip_address does not clash with any key words, it can be used verbatim. hosts-*, on the other hand, cannot, as it clashes with - (the minus operation) and *; hence the double quotes.

Another example:

SELECT "from" FROM "<logstash-{now/d}>"

The first identifier, from, needs to be quoted as otherwise it clashes with the FROM key word (which is case-insensitive and thus can be written as from), while the second identifier, which uses Elasticsearch date math support in index names, would otherwise confuse the parser.

For this reason, it is highly recommended to quote identifiers in general, especially when dealing with user input. Quoting adds minimal overhead to your queries and in return offers clarity and disambiguation.
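
As a minimal sketch of the quoting recommendation above, a client could wrap every identifier in double quotes before building a query. The helper names quote_identifier and select_columns are hypothetical, and doubling an embedded double quote is an assumption based on the ANSI SQL convention; this page does not cover that case.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes so it cannot clash with a key word.

    Assumption: an embedded double quote is escaped by doubling it (ANSI SQL
    convention); the surrounding text does not specify this.
    """
    if not name:
        raise ValueError("identifier must not be empty")
    return '"' + name.replace('"', '""') + '"'


def select_columns(columns: list[str], index: str) -> str:
    """Build a SELECT statement with every identifier quoted (hypothetical helper)."""
    cols = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {cols} FROM {quote_identifier(index)}"


print(select_columns(["from"], "<logstash-{now/d}>"))
# SELECT "from" FROM "<logstash-{now/d}>"
```

Quoting unconditionally keeps the helper simple: it does not need to know the key word list to decide when quotes are required.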

Literals (Constants)

Elasticsearch SQL supports two kinds of implicitly-typed literals: strings and numbers.

String Literals

A string literal is an arbitrary number of characters bounded by single quotes ': 'Giant Robot'. To include a single quote in the string, escape it using another single quote: 'Captain EO''s Voyage'.

An escaped single quote is not a double quote ("), but a single quote ' repeated ('').
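
The escaping rule above is mechanical enough to sketch in Python. The function names string_literal and parse_string_literal are hypothetical; the code only applies the single-quote doubling described in this section and makes no claim about other escape sequences.

```python
def string_literal(value: str) -> str:
    """Render a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def parse_string_literal(literal: str) -> str:
    """Inverse of string_literal: strip the outer quotes and undouble inner ones."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("a string literal must be bounded by single quotes")
    body = literal[1:-1]
    # After removing every doubled quote, no single quote may remain.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped single quote inside string literal")
    return body.replace("''", "'")


print(string_literal("Captain EO's Voyage"))  # 'Captain EO''s Voyage'
```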

Numeric Literals

Numeric literals are accepted in both decimal and scientific notation with an exponent marker (e or E), starting with either a digit or a decimal point (.):

1969    -- integer notation
3.14    -- decimal notation
.1234   -- decimal notation starting with decimal point
4E5     -- scientific notation (with exponent marker)
1.2e-3  -- scientific notation with decimal point

Numeric literals that contain a decimal point are always interpreted as being of type double. Those without are considered integer if they fit; otherwise their type is long (or BIGINT in ANSI SQL types).
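
The typing rule above can be illustrated with a short Python sketch. Several points are assumptions rather than statements from this page: integer is taken to be a 32-bit signed value and long a 64-bit signed value, a literal with an exponent marker but no decimal point (such as 4E5) is treated as double, and a value too large for long is rejected. The function name literal_type is hypothetical.

```python
import re

INT_MAX = 2**31 - 1    # assumed upper bound of "integer" (32-bit signed)
LONG_MAX = 2**63 - 1   # assumed upper bound of "long" (64-bit signed)

# Digits with an optional fraction, or a leading decimal point, followed by
# an optional exponent. A sign is not part of the literal; it is a unary operator.
NUMERIC_RE = re.compile(r"(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def literal_type(text: str) -> str:
    """Return the type name a numeric literal would get under the rules above."""
    if not NUMERIC_RE.fullmatch(text):
        raise ValueError(f"not a numeric literal: {text!r}")
    # Assumption: exponent notation without a decimal point is also double.
    if "." in text or "e" in text.lower():
        return "double"
    value = int(text)
    if value <= INT_MAX:
        return "integer"
    if value <= LONG_MAX:
        return "long"
    raise ValueError(f"literal does not fit in a long: {text}")


for sample in ["1969", "3.14", ".1234", "2147483648"]:
    print(sample, literal_type(sample))
# 1969 integer
# 3.14 double
# .1234 double
# 2147483648 long
```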

Generic Literals

When dealing with a literal of an arbitrary type, one creates the object by casting, typically from the string representation to the desired type. This can be achieved through the dedicated functions:

CAST('1969-05-13T12:34:56' AS TIMESTAMP)    -- cast the given string to datetime
CONVERT('10.0.0.1', IP)                     -- cast '10.0.0.1' to an IP

Note that Elasticsearch SQL provides functions that return popular literals out of the box (like E()) or provide dedicated parsing for certain strings.

Single vs Double Quotes

It is worth pointing out that in SQL, single quotes ' and double quotes " have different meanings and cannot be used interchangeably. Single quotes are used to declare a string literal, while double quotes are used for identifiers.

To wit:

SELECT "first_name"   -- double quotes " used for a column identifier
  FROM "musicians"    -- double quotes " used for a table identifier
 WHERE "last_name"    -- double quotes " used for a column identifier
     = 'Carroll'      -- single quotes ' used for a string literal

Special characters

A few characters that are not alphanumeric have a dedicated meaning different from that of an operator. For completeness these are specified below:

Char	Description
`*`	The asterisk (or wildcard) is used in some contexts to denote all fields for a table. Can be also used as an argument to some aggregate functions.
`,`	Commas are used to enumerate the elements of a list.
`.`	Used in numeric constants or to separate identifier qualifiers (catalog, table, column names, etc.).
`()`	Parentheses are used for specific SQL commands, function declarations or to enforce precedence.

Operators

Most operators in Elasticsearch SQL have the same precedence and are left-associative. As this is done at parsing time, parentheses need to be used to enforce a different precedence.

The following table lists the supported operators and their precedence (highest to lowest):

Operator/Element	Associativity	Description
`.`	left	qualifier separator
`+ -`	right	unary plus and minus (numeric literal sign)
`* / %`	left	multiplication, division, modulo
`+ -`	left	addition, subtraction
`BETWEEN IN LIKE`		range containment, string matching
`< > <= >= = <=> <> !=`		comparison
`NOT`	right	logical negation
`AND`	left	logical conjunction
`OR`	left	logical disjunction

Comments

Elasticsearch SQL allows comments, which are sequences of characters ignored by the parser.

Two styles are supported:

Single line: comments start with a double dash -- and continue until the end of the line.
Multi line: comments start with /* and end with */ (also known as C-style).

-- single line comment
/* multi
   line
   comment
   that supports /* nested comments */
   */
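
To make the nesting behavior concrete, here is a rough Python sketch that removes both comment styles from a command, tracking the nesting depth of multi-line comments. It is an illustration, not the Elasticsearch SQL parser: the function name strip_comments is hypothetical, and leaving comment markers inside quoted strings and identifiers untouched is an assumption.

```python
def strip_comments(sql: str) -> str:
    """Remove -- comments and (possibly nested) /* */ comments from a command."""
    out = []
    i, n = 0, len(sql)
    depth = 0  # current nesting level of /* ... */ comments
    while i < n:
        two = sql[i:i + 2]
        if depth > 0:
            # Inside a multi-line comment: only track nesting, emit nothing.
            if two == "/*":
                depth, i = depth + 1, i + 2
            elif two == "*/":
                depth, i = depth - 1, i + 2
            else:
                i += 1
        elif two == "/*":
            depth, i = 1, i + 2
        elif two == "--":
            # Skip to the end of the line but keep the newline itself.
            end = sql.find("\n", i)
            i = n if end == -1 else end
        elif sql[i] in "'\"":
            # Assumption: quoted text is copied verbatim; a doubled quote
            # character inside it is an escape, not a terminator.
            quote, j = sql[i], i + 1
            while j < n:
                if sql[j] == quote:
                    if j + 1 < n and sql[j + 1] == quote:
                        j += 2
                        continue
                    break
                j += 1
            out.append(sql[i:j + 1])
            i = j + 1
        else:
            out.append(sql[i])
            i += 1
    if depth:
        raise ValueError("unterminated multi-line comment")
    return "".join(out)


print(strip_comments("SELECT 1 /* outer /* inner */ still outer */ + 2"))
# SELECT 1  + 2
```

A single depth counter is enough here because multi-line comments can only nest inside one another; no stack of delimiters is needed.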
