Data enrichment
The ES|QL ENRICH processing command combines, at query time, data from one or more source indices with field-value combinations found in Elasticsearch enrich indices.
For example, you can use ENRICH to:
- Identify web services or vendors based on known IP addresses
- Add product information to retail orders based on product IDs
- Supplement contact information based on an email address
How the ENRICH command works
The ENRICH command adds new columns to a table, with data from Elasticsearch indices.
It requires a few special components:
- Enrich policy: A set of configuration options used to add the right enrich data to the input table.
  An enrich policy contains:
  - A list of one or more source indices which store enrich data as documents
  - The policy type which determines how the processor matches the enrich data to incoming documents
  - A match field from the source indices used to match incoming documents
  - Enrich fields containing enrich data from the source indices you want to add to incoming documents
  After you create a policy, you must execute it before it can be used. Executing an enrich policy uses data from the policy’s source indices to create a streamlined system index called the enrich index. The ENRICH command uses this index to match and enrich an input table.
- Source index: An index which stores enrich data that the ENRICH command can add to input tables. You can create and manage these indices just like a regular Elasticsearch index. You can use multiple source indices in an enrich policy. You also can use the same source index in multiple enrich policies.
- Enrich index: A special system index tied to a specific enrich policy.
  Directly matching rows from input tables to documents in source indices could be slow and resource intensive. To speed things up, the ENRICH command uses an enrich index.
  Enrich indices contain enrich data from source indices but have a few special properties to help streamline them:
  - They are system indices, meaning they’re managed internally by Elasticsearch and only intended for use with enrich processors and the ES|QL ENRICH command.
  - They always begin with .enrich-*.
  - They are read-only, meaning you can’t directly change them.
  - They are force merged for fast retrieval.
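As a rough sketch, the four parts of an enrich policy described above can be written out as a request body for the create enrich policy API. The source index name languages is an assumption made for illustration; the field names follow the languages_policy examples used later on this page.

```python
import json

# Sketch of an enrich policy body. The source index name "languages" is
# hypothetical; the field names follow the languages_policy examples.
policy_body = {
    "match": {                               # policy type
        "indices": ["languages"],            # source indices
        "match_field": "language_code",      # match field
        "enrich_fields": ["language_name"],  # enrich fields
    }
}

print(json.dumps(policy_body, indent=2))
```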
Set up an enrich policy
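To start using ENRICH, follow these steps:
- Check the prerequisites.
- Add enrich data.
- Create an enrich policy.
- Execute the enrich policy.
- Use the enrich policy.
Once you have enrich policies set up, you can update your enrich data and update your enrich policies.
The ENRICH command performs several operations and may impact the speed of your query.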
Prerequisites
To use enrich policies, you must have:
- read index privileges for any indices used
- The enrich_user built-in role
Add enrich data
To begin, add documents to one or more source indices. These documents should contain the enrich data you eventually want to add to incoming data.
You can manage source indices just like regular Elasticsearch indices using the document and index APIs.
You also can set up Beats, such as Filebeat, to automatically send and index documents to your source indices. See Getting started with Beats.
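One possible way to load enrich data is the bulk API. The sketch below only builds the newline-delimited JSON body and sends nothing. The index name languages and the sample documents are assumptions for illustration.

```python
import json

# Hypothetical enrich data; "languages" is an assumed source index name.
source_docs = [
    {"language_code": "1", "language_name": "English"},
    {"language_code": "2", "language_name": "French"},
]

def bulk_body(docs):
    """Build the newline-delimited JSON body for POST /languages/_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # action line: index one document
        lines.append(json.dumps(doc))            # source line: the document itself
    return "\n".join(lines) + "\n"               # bulk bodies must end with a newline

print(bulk_body(source_docs))
```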
Create an enrich policy
After adding enrich data to your source indices, use the create enrich policy API or Index Management in Kibana to create an enrich policy.
Once created, you can’t update or change an enrich policy. See Update an enrich policy.
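A minimal sketch of a create enrich policy call, assuming a cluster at http://localhost:9200 and a hypothetical source index named languages. The request is built but not sent, and authentication is left out.

```python
import json
import urllib.request

# Assumed cluster address; adjust for your deployment.
ES_URL = "http://localhost:9200"

def create_policy_request(name, body):
    """Build (without sending) a create enrich policy request."""
    return urllib.request.Request(
        url=f"{ES_URL}/_enrich/policy/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

request = create_policy_request(
    "languages_policy",
    {"match": {"indices": "languages",  # hypothetical source index
               "match_field": "language_code",
               "enrich_fields": ["language_name"]}},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```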
Execute the enrich policy
Once the enrich policy is created, you need to execute it using the execute enrich policy API or Index Management in Kibana to create an enrich index.
The enrich index contains documents from the policy’s source indices.
Enrich indices always begin with .enrich-*, are read-only, and are force merged.
Enrich indices should only be used by the enrich processor or the ES|QL ENRICH command. Avoid using enrich indices for other purposes.
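The execute step can be sketched in the same spirit. The cluster address is an assumption, and looks_like_enrich_index is a hypothetical helper that relies only on the naming rule stated above; it could be used to keep enrich indices out of other tooling.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumed cluster address

def execute_policy_request(name):
    """Build (without sending) an execute enrich policy request."""
    return urllib.request.Request(
        url=f"{ES_URL}/_enrich/policy/{name}/_execute",
        method="PUT",
    )

def looks_like_enrich_index(index_name):
    """Enrich indices always begin with ".enrich-"."""
    return index_name.startswith(".enrich-")

request = execute_policy_request("languages_policy")
print(request.get_method(), request.full_url)
```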
Use the enrich policy
After the policy has been executed, you can use the ENRICH command to enrich your data.
The following example uses the languages_policy enrich policy to add a new column for each enrich field defined in the policy. The match is performed using the match_field defined in the enrich policy and requires that the input table has a column with the same name (language_code in this example). ENRICH will look for records in the enrich index based on the match field value.
ROW language_code = "1" | ENRICH languages_policy
language_code:keyword | language_name:keyword
---|---
1 | English
To use a column with a different name than the match_field defined in the policy as the match field, use ON <column-name>:
ROW a = "1" | ENRICH languages_policy ON a
a:keyword | language_name:keyword
---|---
1 | English
By default, each of the enrich fields defined in the policy is added as a column. To explicitly select the enrich fields that are added, use WITH <field1>, <field2>, ...:
ROW a = "1" | ENRICH languages_policy ON a WITH language_name
a:keyword | language_name:keyword
---|---
1 | English
You can rename the columns that are added using WITH new_name=<field1>:
ROW a = "1" | ENRICH languages_policy ON a WITH name = language_name
a:keyword | name:keyword
---|---
1 | English
In case of name collisions, the newly created columns will override existing columns.
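To make the ON, WITH, and name-collision behavior concrete, here is a toy model in Python. It is not how Elasticsearch implements ENRICH: rows are plain dicts, the enrich index is a list of hypothetical documents, and the model assumes that a row without a match gets None in the new columns. It also ignores multiple matches.

```python
def enrich(rows, enrich_docs, match_field, enrich_fields, on=None, with_=None):
    """Toy model of ENRICH over rows represented as dicts.

    on:    input column to match on (defaults to the policy's match_field)
    with_: mapping of new column name -> enrich field (defaults to all enrich fields)
    """
    on = on or match_field
    with_ = with_ or {field: field for field in enrich_fields}
    lookup = {doc[match_field]: doc for doc in enrich_docs}
    result = []
    for row in rows:
        doc = lookup.get(row.get(on))
        new_row = dict(row)
        for column, field in with_.items():
            # New columns override existing columns with the same name.
            # Assumption of this toy model: rows without a match get None.
            new_row[column] = doc.get(field) if doc else None
        result.append(new_row)
    return result

# Hypothetical enrich data standing in for the enrich index.
languages = [{"language_code": "1", "language_name": "English"}]

# ROW a = "1" | ENRICH languages_policy ON a WITH name = language_name
print(enrich([{"a": "1"}], languages, "language_code", ["language_name"],
             on="a", with_={"name": "language_name"}))
# Name collision: the existing language_name column is replaced.
print(enrich([{"language_code": "1", "language_name": "?"}], languages,
             "language_code", ["language_name"]))
```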
Update an enrich index
Once created, you cannot update or index documents to an enrich index. Instead, update your source indices and execute the enrich policy again. This creates a new enrich index from your updated source indices. The previous enrich index is deleted by a delayed maintenance job. By default, this job runs every 15 minutes.
Update an enrich policy
Once created, you can’t update or change an enrich policy. Instead, you can:
- Create and execute a new enrich policy.
- Replace the previous enrich policy with the new enrich policy in any in-use enrich processors or ES|QL queries.
- Use the delete enrich policy API or Index Management in Kibana to delete the previous enrich policy.
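The replacement sequence above can be sketched as an ordered list of API calls. The policy names are hypothetical, and the function only returns the plan without contacting a cluster.

```python
def policy_replacement_plan(old_name, new_name):
    """Return the API calls, in order, for swapping one enrich policy for another."""
    return [
        ("PUT", f"/_enrich/policy/{new_name}"),           # create the new policy
        ("PUT", f"/_enrich/policy/{new_name}/_execute"),  # execute it to build its enrich index
        # Switch ES|QL queries and enrich processors to new_name before the next call.
        ("DELETE", f"/_enrich/policy/{old_name}"),        # delete the previous policy
    ]

for method, path in policy_replacement_plan("languages_policy", "languages_policy_v2"):
    print(method, path)
```

Deleting the old policy last keeps existing queries working until they have been switched to the new name.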
Enrich Policy Types and Limitations
The ES|QL ENRICH command supports all three enrich policy types:
- geo_match: Matches enrich data to incoming documents based on a geo_shape query. For an example, see Example: Enrich your data based on geolocation.
- match: Matches enrich data to incoming documents based on a term query. For an example, see Example: Enrich your data based on exact values.
- range: Matches a number, date, or IP address in incoming documents to a range in the enrich index based on a term query. For an example, see Example: Enrich your data by matching a value to a range.
While all three enrich policy types are supported, there are some limitations to be aware of:
- The geo_match enrich policy type only supports the intersects spatial relation.
- The match_field in the ENRICH command must be of the correct type. For example, if the enrich policy is of type geo_match, the match_field in the ENRICH command must be of type geo_point or geo_shape. Likewise, a range enrich policy requires a match_field of type integer, long, date, or ip, depending on the type of the range field in the original enrich index.
  However, this constraint is relaxed for range policies when the match_field is of type KEYWORD. In this case, the field values are parsed during query execution, row by row. If any value fails to parse, the output values for that row are set to null, an appropriate warning is produced, and the query continues to execute.
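The KEYWORD relaxation for range policies can be illustrated with a toy model. This is not Elasticsearch code: the ranges, the department field, and the client_ip column are hypothetical, and Python warnings stand in for the query warnings.

```python
import ipaddress
import warnings

def enrich_by_ip_range(rows, ranges, on):
    """Toy model of a range policy applied to a KEYWORD column of IP strings."""
    result = []
    for row in rows:
        department = None
        try:
            address = ipaddress.ip_address(row[on])  # parsed row by row
        except ValueError:
            # Parse failure: warn, leave the output null, keep going.
            warnings.warn(f"could not parse {row[on]!r} as an IP address")
        else:
            for network, name in ranges:
                if address in ipaddress.ip_network(network):
                    department = name
                    break
        result.append({**row, "department": department})
    return result

# Hypothetical enrich data: network range -> department.
ranges = [("10.100.0.0/16", "OPS")]
rows = [{"client_ip": "10.100.12.7"}, {"client_ip": "not-an-ip"}]
print(enrich_by_ip_range(rows, ranges, on="client_ip"))
```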