Text expansion query
editText expansion query
editThe text expansion query uses a natural language processing model to convert the query text into a list of token-weight pairs which are then used in a query against a sparse vector or rank features field.
Example request
editresponse = client.search( body: { query: { text_expansion: { "<sparse_vector_field>": { model_id: 'the model to produce the token weights', model_text: 'the query string' } } } } ) puts response
GET _search { "query":{ "text_expansion":{ "<sparse_vector_field>":{ "model_id":"the model to produce the token weights", "model_text":"the query string" } } } }
Top level parameters for text_expansion
edit-
<sparse_vector_field>
- (Required, object) The name of the field that contains the token-weight pairs the NLP model created based on the input text.
Top level parameters for <sparse_vector_field>
edit-
model_id
- (Required, string) The ID of the model to use to convert the query text into token-weight pairs. It must be the same model ID that was used to create the tokens from the input text.
-
model_text
- (Required, string) The query text you want to use for search.
-
pruning_config
-
(Optional, object) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. Optional pruning configuration. If enabled, this will omit non-significant tokens from the query in order to improve query performance. Default: Disabled.
Parameters for
<pruning_config>
are:-
tokens_freq_ratio_threshold
-
(Optional, integer)
[preview]
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
Tokens whose frequency is more than
tokens_freq_ratio_threshold
times the average frequency of all tokens in the specified field are considered outliers and pruned. This value must between 1 and 100. Default:5
. -
tokens_weight_threshold
-
(Optional, float)
[preview]
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
Tokens whose weight is less than
tokens_weight_threshold
are considered nonsignificant and pruned. This value must be between 0 and 1. Default:0.4
. -
only_score_pruned_tokens
-
(Optional, boolean)
[preview]
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
If
true
we only input pruned tokens into scoring, and discard non-pruned tokens. It is strongly recommended to set this tofalse
for the main query, but this can be set totrue
for a rescore query to get more relevant results. Default:false
.
The default values for
tokens_freq_ratio_threshold
andtokens_weight_threshold
were chosen based on tests using ELSER that provided the most optimal results. -
Example ELSER query
editThe following is an example of the text_expansion
query that references the
ELSER model to perform semantic search. For a more detailed description of how
to perform semantic search by using ELSER and the text_expansion
query, refer
to this tutorial.
response = client.search( index: 'my-index', body: { query: { text_expansion: { 'ml.tokens' => { model_id: '.elser_model_2', model_text: 'How is the weather in Jamaica?' } } } } ) puts response
GET my-index/_search { "query":{ "text_expansion":{ "ml.tokens":{ "model_id":".elser_model_2", "model_text":"How is the weather in Jamaica?" } } } }
Multiple text_expansion
queries can be combined with each other or other query types.
This can be achieved by wrapping them in boolean query clauses and using linear boosting:
GET my-index/_search { "query": { "bool": { "should": [ { "text_expansion": { "ml.inference.title_expanded.predicted_value": { "model_id": ".elser_model_2", "model_text": "How is the weather in Jamaica?", "boost": 1 } } }, { "text_expansion": { "ml.inference.description_expanded.predicted_value": { "model_id": ".elser_model_2", "model_text": "How is the weather in Jamaica?", "boost": 1 } } }, { "multi_match": { "query": "How is the weather in Jamaica?", "fields": [ "title", "description" ], "boost": 4 } } ] } } }
This can also be achieved by using sub searches combined with Reciprocal rank fusion.
GET my-index/_search { "sub_searches": [ { "query": { "multi_match": { "query": "How is the weather in Jamaica?", "fields": [ "title", "description" ] } } }, { "query": { "text_expansion": { "ml.inference.title_expanded.predicted_value": { "model_id": ".elser_model_2", "model_text": "How is the weather in Jamaica?" } } } }, { "query": { "text_expansion": { "ml.inference.description_expanded.predicted_value": { "model_id": ".elser_model_2", "model_text": "How is the weather in Jamaica?" } } } } ], "rank": { "rrf": { "window_size": 10, "rank_constant": 20 } } }
Example ELSER query with pruning configuration and rescore
editThe following is an extension to the above example that adds a
[preview]
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
pruning configuration to the text_expansion
query.
The pruning configuration identifies non-significant tokens to prune from the query in order to improve query performance.
Token pruning happens at the shard level.
While this should result in the same tokens being labeled as insignificant across shards, this is not guaranteed based on the composition of each shard.
Therefore, if you are running text_expansion
with a pruning_config
on a multi-shard index, we strongly recommend adding a Rescore filtered search results function with the tokens that were originally pruned from the query.
This will help mitigate any shard-level inconsistency with pruned tokens and provide better relevance overall.
GET my-index/_search { "query":{ "text_expansion":{ "ml.tokens":{ "model_id":".elser_model_2", "model_text":"How is the weather in Jamaica?", "pruning_config": { "tokens_freq_ratio_threshold": 5, "tokens_weight_threshold": 0.4, "only_score_pruned_tokens": false } } } }, "rescore": { "window_size": 100, "query": { "rescore_query": { "text_expansion": { "ml.tokens": { "model_id": ".elser_model_2", "model_text": "How is the weather in Jamaica?", "pruning_config": { "tokens_freq_ratio_threshold": 5, "tokens_weight_threshold": 0.4, "only_score_pruned_tokens": true } } } } } } }
Depending on your data, the text expansion query may be faster with track_total_hits: false
.