New to the Elasticsearch repository: ES|QL
Updated on August 17, 2023, to indicate that the ES|QL branch has been merged into the Elasticsearch main branch.
It is my pleasure to announce that after roughly one year of development, Elasticsearch Query Language (ES|QL) is ready to be shared with the world and has landed in the Elasticsearch® repository. ES|QL is a powerful declarative language native to Elasticsearch and designed for composability, expressiveness, and speed.
Why another language?
Elasticsearch supports a number of languages, from the venerable QueryDSL to EQL, KQL, SQL, Painless, Canvas/Timelion, and others. As its adoption increased, so did its audience and that audience's needs. It’s no longer just “full text search”; it now covers log exploration, threat hunting, reporting, alerting, and custom processing.
As consumers of our own product, we wanted a single, consolidated way to interact with Elasticsearch: one that brings comprehensive compute capabilities close to the data and eliminates the need for expensive transfers to external systems for custom processing.
Here is an ES|QL query taken from a test suite using the MySQL sample employees data set (with a number of modifications):
FROM employees
| EVAL hired_year = TO_INTEGER(DATE_FORMAT(hire_date, "YYYY"))
| WHERE hired_year > 1984
| STATS avg_salary = AVG(salary) BY languages
| EVAL avg_salary = ROUND(avg_salary)
| EVAL lang_code = TO_STRING(languages)
| ENRICH languages_policy ON lang_code WITH lang = language_name
| WHERE lang IS NOT NULL
| KEEP avg_salary, lang
| SORT avg_salary ASC
| LIMIT 3
That returns a response similar to:
{
"columns": [
{"name": "avg_salary", "type": "double"},
{"name": "lang", "type": "keyword"}
],
"values": [
[43760.0, "Spanish"],
[48644.0, "French"],
[48832.0, "German"]
]
}
A single ES|QL query performs filtering, processing, grouping, renaming, sorting, look-ups, and column pruning.
The query flows top to bottom, just like the data. You can chain as many commands as necessary, reorder them, and use the built-in functions or your own evaluations. ES|QL offers a unified query experience that’s easier and more powerful than the existing querying interfaces, which will continue to be available.
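To make the "data flows top to bottom" idea concrete, here is a minimal Python sketch of a command pipeline. It is only a toy model of the composition style, not how ES|QL is implemented: the helper names (`where`, `stats_avg`, `sort_by`, `limit`, `run`) and the sample rows are made up for illustration.

```python
from collections import defaultdict
from functools import reduce

# Toy in-memory rows; field names mirror the employees example, values are made up.
ROWS = [
    {"languages": 1, "salary": 40000},
    {"languages": 1, "salary": 50000},
    {"languages": 2, "salary": 60000},
    {"languages": 3, "salary": 30000},
]

def where(predicate):
    """Keep only the rows matching the predicate (in the spirit of WHERE)."""
    return lambda rows: [r for r in rows if predicate(r)]

def stats_avg(field, by, out):
    """Average `field` per distinct value of `by` (in the spirit of STATS ... BY)."""
    def step(rows):
        groups = defaultdict(list)
        for r in rows:
            groups[r[by]].append(r[field])
        return [{by: key, out: sum(vals) / len(vals)} for key, vals in groups.items()]
    return step

def sort_by(field):
    """Order rows ascending by `field` (in the spirit of SORT)."""
    return lambda rows: sorted(rows, key=lambda r: r[field])

def limit(n):
    """Keep the first n rows (in the spirit of LIMIT)."""
    return lambda rows: rows[:n]

def run(rows, *steps):
    """Feed the rows through each step in order, top to bottom."""
    return reduce(lambda acc, step: step(acc), steps, rows)

result = run(
    ROWS,
    where(lambda r: r["salary"] > 35000),
    stats_avg("salary", by="languages", out="avg_salary"),
    sort_by("avg_salary"),
    limit(1),
)
print(result)
```

Each step takes the output of the previous one, so adding, removing, or reordering steps changes the result in the same way that editing the lines of a piped query does.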
Dedicated query engine
ES|QL is not just a language, but also a full-blown, specialized query and compute engine for Elasticsearch. There is no translation or transpilation to QueryDSL: every ES|QL query is lexed and parsed, resolved and semantically analyzed, verified, and optimized, and then goes through a planning phase for distributed execution against the data in the cluster. The designated target nodes are responsible for local execution and exploit the characteristics of their local data by performing their own replanning using the ES|QL infrastructure.
ES|QL brings a new execution engine designed with performance in mind: it operates on blocks of data at a time instead of row by row, targets vectorization and cache locality, and embraces specialization and multi-threading. It is a separate component from the existing Elasticsearch aggregation framework, with different performance characteristics. In our current benchmarks, several aggregations show significant improvements.
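As a rough illustration of the difference between per-row and block-at-a-time processing, the Python sketch below computes the same average both ways. It says nothing about the actual engine internals: the function names, the block size, and the sample values are assumptions chosen for the example, and plain Python will not show the cache or vectorization benefits a real engine targets.

```python
from array import array

# Hypothetical column data; in a block layout each column is stored contiguously.
salaries = array("d", [40000, 50000, 60000, 30000, 45000, 55000])

def avg_row_at_a_time(rows):
    """Visit one row (a dict) at a time, paying a field lookup per row."""
    total = 0.0
    count = 0
    for row in rows:
        total += row["salary"]
        count += 1
    return total / count

def avg_block_at_a_time(column, block_size=4):
    """Consume one column in fixed-size blocks and aggregate each block in one call."""
    total = 0.0
    count = 0
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]  # contiguous slice of a single column
        total += sum(block)
        count += len(block)
    return total / count

rows = [{"salary": s} for s in salaries]
row_avg = avg_row_at_a_time(rows)
block_avg = avg_block_at_a_time(salaries)
print(row_avg, block_avg)
```

Both functions return the same value; the point is the shape of the inner loop, where the block version works on a tight, homogeneous chunk of one column rather than on whole rows.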
The goal is to provide different capabilities, such as performing multiple chained groupings:
POST /_query?format=txt
{
"query" : """
FROM employees
| STATS c = COUNT(emp_no) BY languages
| STATS most_speakers_of_a_lang = MAX(c)
"""
}
most_speakers_of_a_lang
-----------------------
21
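For readers who prefer to script the call, the following is a small Python sketch that builds the same request using only the standard library. The `/_query` endpoint, the `format=txt` parameter, and the `query` body field come from the example above; the base URL (an unsecured local node on port 9200) and the helper name `build_esql_request` are assumptions, so adjust them for your cluster and authentication setup.

```python
import json
import urllib.request

# Assumption: an unsecured local node on the default port.
BASE_URL = "http://localhost:9200"

QUERY = """
FROM employees
| STATS c = COUNT(emp_no) BY languages
| STATS most_speakers_of_a_lang = MAX(c)
"""

def build_esql_request(query, base_url=BASE_URL, fmt="txt"):
    """Build (but do not send) a POST request for the _query endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_query?format={fmt}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_esql_request(QUERY)

# To run it against a live cluster, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The request is built separately from the send step so that the URL and body can be inspected or logged before anything reaches the cluster.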
Keep an eye out for future posts where we’ll explore the features, design decisions, and architecture of ES|QL.
Where can I get it?
The ES|QL code is available in the Elasticsearch main branch and will be released as a tech preview in Elasticsearch. It is a free tier feature and available to everyone. Nightly snapshots will be available for download shortly; in the meantime, feel free to check out the code and build it yourself. Take ES|QL for a spin and start exploring your local data; read the docs here.
Because we’re in the early stages of ES|QL, there might be a few loose ends, bumps, or even a bug; if you run into one, please file an issue. We cannot wait to share ES|QL with the Elasticsearch community!
On behalf of the ES|QL team, we're looking forward to your feedback!
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.