Retriever

A retriever is a specification that describes the top documents returned from a search. A retriever replaces other elements of the search API that also return top documents, such as query and knn. A retriever may have child retrievers; a retriever with two or more children is considered a compound retriever. This allows complex behavior to be depicted in a tree-like structure, called the retriever tree, which clarifies the order of operations that occur during a search.

Refer to Retrievers for a high-level overview of the retrievers abstraction. Refer to Retrievers examples for additional examples.

The following retrievers are available:

standard
A retriever that replaces the functionality of a traditional query.
knn
A retriever that replaces the functionality of a knn search.
rrf
A retriever that produces top documents from reciprocal rank fusion (RRF).
text_similarity_reranker
A retriever that enhances search results by re-ranking documents based on semantic similarity to a specified inference text, using a machine learning model.
rule
A retriever that applies contextual query rules to pin or exclude documents for specific queries.

Standard Retriever

A standard retriever returns top documents from a traditional query.

Parameters

query

(Optional, query object)

Defines a query to retrieve a set of top documents.

filter

(Optional, query object or list of query objects)

Applies a boolean query filter to this retriever. All documents must match this filter, but the filter does not contribute to the score.

search_after

(Optional, search after object)

Defines a search after object parameter used for pagination.

terminate_after

(Optional, integer) Maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

sort

(Optional, sort object) A sort object that specifies the order of matching documents.

min_score

(Optional, float)

Minimum _score for matching documents. Documents with a lower _score are not included in the top documents.

collapse

(Optional, collapse object)

Collapses the top documents by a specified key into a single top document per key.
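
To make the effect of min_score and collapse concrete, the following is a minimal pure-Python sketch that mimics both parameters on an in-memory list of scored hits. It is an illustration only and not how Elasticsearch implements them; the hit dictionaries, the region field, and the helper apply_min_score_and_collapse are hypothetical.

```python
def apply_min_score_and_collapse(hits, min_score=None, collapse_field=None):
    """Mimic min_score and collapse on a list of scored hit dictionaries."""
    # Rank by descending score, as a relevance-sorted search would.
    ranked = sorted(hits, key=lambda h: h["_score"], reverse=True)

    # min_score: drop every hit whose score is below the threshold.
    if min_score is not None:
        ranked = [h for h in ranked if h["_score"] >= min_score]

    if collapse_field is None:
        return ranked

    # collapse: keep only the best-scoring hit for each distinct key.
    seen, collapsed = set(), []
    for hit in ranked:
        key = hit["fields"].get(collapse_field)
        if key not in seen:
            seen.add(key)
            collapsed.append(hit)
    return collapsed


hits = [
    {"_id": "1", "_score": 2.0, "fields": {"region": "Austria"}},
    {"_id": "2", "_score": 1.5, "fields": {"region": "Austria"}},
    {"_id": "3", "_score": 1.2, "fields": {"region": "France"}},
    {"_id": "4", "_score": 0.3, "fields": {"region": "Italy"}},
]
top = apply_min_score_and_collapse(hits, min_score=1.0, collapse_field="region")
print([h["_id"] for h in top])  # ['1', '3']
```

Hit 4 falls below min_score, and hit 2 is collapsed into hit 1 because both share the same region key.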

Restrictions

When a retriever tree contains a compound retriever (a retriever with two or more child retrievers), the search_after parameter is not supported.
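
A client-side check for this restriction could look like the sketch below. It assumes, based on the examples on this page, that child retrievers live under a retrievers array (as in rrf) or a single retriever key (as in text_similarity_reranker and rule). The helper names are hypothetical and not part of any Elasticsearch client.

```python
def child_retrievers(retriever):
    """Return the children of a single-key retriever object such as {"rrf": {...}}."""
    body = next(iter(retriever.values()))
    children = list(body.get("retrievers", []))
    if "retriever" in body:
        children.append(body["retriever"])
    return children


def has_compound(retriever):
    """True if this retriever, or any descendant, has two or more children."""
    children = child_retrievers(retriever)
    return len(children) >= 2 or any(has_compound(c) for c in children)


def uses_search_after(retriever):
    """True if search_after appears anywhere in the retriever tree."""
    body = next(iter(retriever.values()))
    return "search_after" in body or any(
        uses_search_after(c) for c in child_retrievers(retriever)
    )


def violates_search_after_restriction(retriever):
    return has_compound(retriever) and uses_search_after(retriever)


compound_tree = {
    "rrf": {
        "retrievers": [
            {"standard": {"query": {"match_all": {}}, "search_after": [1]}},
            {"knn": {"field": "vector", "query_vector": [10, 22, 77],
                     "k": 10, "num_candidates": 10}},
        ]
    }
}
single = {"standard": {"query": {"match_all": {}}, "search_after": [1]}}

print(violates_search_after_restriction(compound_tree))  # True
print(violates_search_after_restriction(single))         # False
```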

Example

resp = client.search(
    index="restaurants",
    retriever={
        "standard": {
            "query": {
                "bool": {
                    "should": [
                        {
                            "match": {
                                "region": "Austria"
                            }
                        }
                    ],
                    "filter": [
                        {
                            "term": {
                                "year": "2019"
                            }
                        }
                    ]
                }
            }
        }
    },
)
print(resp)
const response = await client.search({
  index: "restaurants",
  retriever: {
    standard: {
      query: {
        bool: {
          should: [
            {
              match: {
                region: "Austria",
              },
            },
          ],
          filter: [
            {
              term: {
                year: "2019",
              },
            },
          ],
        },
      },
    },
  },
});
console.log(response);
GET /restaurants/_search
{
  "retriever": { 
    "standard": { 
      "query": { 
        "bool": { 
          "should": [ 
            {
              "match": { 
                "region": "Austria"
              }
            }
          ],
          "filter": [ 
            {
              "term": { 
                "year": "2019" 
              }
            }
          ]
        }
      }
    }
  }
}

In the request above:

  • retriever: Opens the retriever object.
  • standard: The standard retriever is used for defining traditional Elasticsearch queries.
  • query: The entry point for defining the search query.
  • bool: The bool object allows for combining multiple query clauses logically.
  • should: The should array indicates conditions under which a document will match. Documents matching these conditions will have increased relevancy scores.
  • match: The match object finds documents where the region field contains the word "Austria".
  • filter: The filter array provides filtering conditions that must be met but do not contribute to the relevancy score.
  • term: The term object is used for exact matches, in this case, filtering documents by the year field.
  • "2019": The exact value to match in the year field.

kNN Retriever

A kNN retriever returns top documents from a k-nearest neighbor search (kNN).

Parameters

field

(Required, string)

The name of the vector field to search against. Must be a dense_vector field with indexing enabled.

query_vector

(Required if query_vector_builder is not defined, array of float)

Query vector. Must have the same number of dimensions as the vector field you are searching against. Must be either an array of floats or a hex-encoded byte vector.

query_vector_builder

(Required if query_vector is not defined, query vector builder object)

Defines a model to build a query vector.

k

(Required, integer)

Number of nearest neighbors to return as top hits. This value must be less than or equal to num_candidates.

num_candidates

(Required, integer)

The number of nearest neighbor candidates to consider per shard. Must be greater than or equal to k (or size if k is omitted) and cannot exceed 10,000. Elasticsearch collects num_candidates results from each shard, then merges them to find the top k results. Increasing num_candidates tends to improve the accuracy of the final k results. Defaults to Math.min(1.5 * k, 10_000).

filter

(Optional, query object or list of query objects)

Query to filter the documents that can match. The kNN search will return the top k documents that also match this filter. The value can be a single query or a list of queries. If filter is not provided, all documents are allowed to match.

similarity

(Optional, float)

The minimum similarity required for a document to be considered a match. The similarity value relates to the raw similarity used, not the document score. Matched documents are then scored according to similarity, and the provided boost is applied.

The similarity parameter is the direct vector similarity calculation.

  • l2_norm: Also known as Euclidean distance. Includes documents whose vector lies within the dims-dimensional hypersphere of radius similarity centered at query_vector.
  • cosine, dot_product, and max_inner_product: Only returns vectors whose cosine similarity or dot product is at least the provided similarity.

Read more here: knn similarity search
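
The two threshold rules above can be sketched in plain Python as follows. This is a simplified illustration that compares raw vectors directly; it does not reproduce Elasticsearch's scoring or indexing, the function passes_similarity is hypothetical, and treating the boundary as inclusive is an assumption.

```python
import math


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def passes_similarity(metric, query_vector, doc_vector, similarity):
    """Return True if doc_vector meets the raw similarity threshold."""
    if metric == "l2_norm":
        # Inside the hypersphere of radius `similarity` around query_vector.
        return l2_distance(query_vector, doc_vector) <= similarity
    if metric == "cosine":
        return cosine(query_vector, doc_vector) >= similarity
    if metric in ("dot_product", "max_inner_product"):
        return dot(query_vector, doc_vector) >= similarity
    raise ValueError(f"unknown metric: {metric}")


query = [10, 22, 77]
print(passes_similarity("l2_norm", query, [10, 22, 80], 5))  # True (distance 3)
print(passes_similarity("l2_norm", query, [0, 0, 0], 5))     # False
print(passes_similarity("cosine", [1, 0], [0, 1], 0.9))      # False (cosine 0)
```

Note the direction of the comparison: for l2_norm a smaller distance is better, so similarity acts as an upper bound, while for the other metrics it acts as a lower bound.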

Restrictions

The parameters query_vector and query_vector_builder cannot be used together.
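
As a rough sketch, the restriction above and the numeric constraints from the parameter descriptions can be checked on the client before a request is sent. The builder below is hypothetical and only assembles a dictionary in the shape used by the examples on this page; it is not part of the Elasticsearch client.

```python
def build_knn_retriever(field, k, num_candidates,
                        query_vector=None, query_vector_builder=None):
    """Assemble a knn retriever body, validating the documented constraints."""
    # Exactly one of query_vector / query_vector_builder must be given.
    if (query_vector is None) == (query_vector_builder is None):
        raise ValueError(
            "specify exactly one of query_vector or query_vector_builder"
        )
    if k > num_candidates:
        raise ValueError("k must be less than or equal to num_candidates")
    if num_candidates > 10_000:
        raise ValueError("num_candidates cannot exceed 10,000")

    body = {"field": field, "k": k, "num_candidates": num_candidates}
    if query_vector is not None:
        body["query_vector"] = list(query_vector)
    else:
        body["query_vector_builder"] = query_vector_builder
    return {"knn": body}


retriever = build_knn_retriever(
    "vector", k=10, num_candidates=10, query_vector=[10, 22, 77]
)
print(retriever)
```

The returned dictionary can be passed as the retriever argument of a search request.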

Example

resp = client.search(
    index="restaurants",
    retriever={
        "knn": {
            "field": "vector",
            "query_vector": [
                10,
                22,
                77
            ],
            "k": 10,
            "num_candidates": 10
        }
    },
)
print(resp)
const response = await client.search({
  index: "restaurants",
  retriever: {
    knn: {
      field: "vector",
      query_vector: [10, 22, 77],
      k: 10,
      num_candidates: 10,
    },
  },
});
console.log(response);
GET /restaurants/_search
{
  "retriever": {
    "knn": { 
      "field": "vector", 
      "query_vector": [10, 22, 77], 
      "k": 10, 
      "num_candidates": 10 
    }
  }
}

In the request above:

  • knn: Configuration for k-nearest neighbor (knn) search, which is based on vector similarity.
  • field: Specifies the field name that contains the vectors.
  • query_vector: The query vector against which document vectors are compared in the knn search.
  • k: The number of nearest neighbors to return as top hits. This value must be less than or equal to num_candidates.
  • num_candidates: The size of the initial candidate set from which the final k nearest neighbors are selected.
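
The relationship between num_candidates and k can be pictured with the sketch below: each shard contributes up to num_candidates nearest documents, and the merged candidates are then reduced to the top k. This is a simplification under stated assumptions. It uses exact brute-force Euclidean distance on in-memory lists, whereas a real search is approximate, and the shard data and the function knn_across_shards are hypothetical.

```python
import heapq


def knn_across_shards(shards, query_vector, k, num_candidates):
    """Collect num_candidates per shard, then keep the k nearest overall."""

    def squared_distance(doc):
        return sum((a - b) ** 2 for a, b in zip(doc["vector"], query_vector))

    candidates = []
    for shard in shards:
        # Each shard returns at most num_candidates of its nearest documents.
        candidates.extend(
            heapq.nsmallest(num_candidates, shard, key=squared_distance)
        )
    # Merge step: keep the k nearest documents across all shards.
    return heapq.nsmallest(k, candidates, key=squared_distance)


shard_a = [{"_id": "a1", "vector": [10, 22, 77]}, {"_id": "a2", "vector": [0, 0, 0]}]
shard_b = [{"_id": "b1", "vector": [11, 22, 77]}, {"_id": "b2", "vector": [50, 50, 50]}]

nearest = knn_across_shards([shard_a, shard_b], [10, 22, 77], k=2, num_candidates=2)
print([doc["_id"] for doc in nearest])  # ['a1', 'b1']
```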

RRF Retriever

An RRF retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers. Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.

Parameters

retrievers

(Required, array of retriever objects)

A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them. Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.

rank_constant

(Optional, integer)

This value determines how much influence documents in individual result sets per query have over the final ranked result set. A higher value indicates that lower ranked documents have more influence. This value must be greater than or equal to 1. Defaults to 60.

rank_window_size

(Optional, integer)

This value determines the size of the individual result sets per query. A higher value will improve result relevance at the cost of performance. The final ranked result set is pruned down to the search request’s size. rank_window_size must be greater than or equal to size and greater than or equal to 1. Defaults to the size parameter.

filter

(Optional, query object or list of query objects)

Applies the specified boolean query filter to all of the specified sub-retrievers, according to each retriever’s specifications.
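
To show how rank_constant and rank_window_size interact, here is a small sketch of reciprocal rank fusion using the commonly published formula, in which each document's score is the sum of 1 / (rank_constant + rank) over the result sets that contain it, with ranks starting at 1. This page does not spell out the formula, so treat the details as assumptions; the function rrf and the alphabetical tie-break are hypothetical choices for this illustration.

```python
from collections import defaultdict


def rrf(result_sets, rank_constant=60, rank_window_size=10):
    """Fuse ranked lists of document IDs with equal weight per list."""
    scores = defaultdict(float)
    for results in result_sets:
        # Only the top rank_window_size entries of each list are considered.
        for rank, doc_id in enumerate(results[:rank_window_size], start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


lexical = ["a", "b", "c"]   # ranked hits from a standard retriever
vector = ["c", "a", "d"]    # ranked hits from a knn retriever

fused = rrf([lexical, vector], rank_constant=1, rank_window_size=50)
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

With rank_constant set to 1, document a scores 1/2 + 1/3 and document c scores 1/4 + 1/2, so a ranks first. A larger rank_constant flattens these differences, which gives lower ranked documents relatively more influence.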

Example: Hybrid search

A simple hybrid search example (lexical search + dense vector search) combining a standard retriever with a knn retriever using RRF:

resp = client.search(
    index="restaurants",
    retriever={
        "rrf": {
            "retrievers": [
                {
                    "standard": {
                        "query": {
                            "multi_match": {
                                "query": "Austria",
                                "fields": [
                                    "city",
                                    "region"
                                ]
                            }
                        }
                    }
                },
                {
                    "knn": {
                        "field": "vector",
                        "query_vector": [
                            10,
                            22,
                            77
                        ],
                        "k": 10,
                        "num_candidates": 10
                    }
                }
            ],
            "rank_constant": 1,
            "rank_window_size": 50
        }
    },
)
print(resp)
const response = await client.search({
  index: "restaurants",
  retriever: {
    rrf: {
      retrievers: [
        {
          standard: {
            query: {
              multi_match: {
                query: "Austria",
                fields: ["city", "region"],
              },
            },
          },
        },
        {
          knn: {
            field: "vector",
            query_vector: [10, 22, 77],
            k: 10,
            num_candidates: 10,
          },
        },
      ],
      rank_constant: 1,
      rank_window_size: 50,
    },
  },
});
console.log(response);
GET /restaurants/_search
{
  "retriever": {
    "rrf": { 
      "retrievers": [ 
        {
          "standard": { 
            "query": {
              "multi_match": {
                "query": "Austria",
                "fields": [
                  "city",
                  "region"
                ]
              }
            }
          }
        },
        {
          "knn": { 
            "field": "vector",
            "query_vector": [10, 22, 77],
            "k": 10,
            "num_candidates": 10
          }
        }
      ],
      "rank_constant": 1, 
      "rank_window_size": 50  
    }
  }
}

In the request above:

  • rrf: Defines a retriever tree with an RRF retriever.
  • retrievers: The sub-retriever array.
  • standard: The first sub-retriever is a standard retriever.
  • knn: The second sub-retriever is a knn retriever.
  • rank_constant: The rank constant for the RRF retriever.
  • rank_window_size: The rank window size for the RRF retriever.

Example: Hybrid search with sparse vectors

A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF:

resp = client.search(
    index="movies",
    retriever={
        "rrf": {
            "retrievers": [
                {
                    "standard": {
                        "query": {
                            "sparse_vector": {
                                "field": "plot_embedding",
                                "inference_id": "my-elser-model",
                                "query": "films that explore psychological depths"
                            }
                        }
                    }
                },
                {
                    "standard": {
                        "query": {
                            "multi_match": {
                                "query": "crime",
                                "fields": [
                                    "plot",
                                    "title"
                                ]
                            }
                        }
                    }
                },
                {
                    "knn": {
                        "field": "vector",
                        "query_vector": [
                            10,
                            22,
                            77
                        ],
                        "k": 10,
                        "num_candidates": 10
                    }
                }
            ]
        }
    },
)
print(resp)
const response = await client.search({
  index: "movies",
  retriever: {
    rrf: {
      retrievers: [
        {
          standard: {
            query: {
              sparse_vector: {
                field: "plot_embedding",
                inference_id: "my-elser-model",
                query: "films that explore psychological depths",
              },
            },
          },
        },
        {
          standard: {
            query: {
              multi_match: {
                query: "crime",
                fields: ["plot", "title"],
              },
            },
          },
        },
        {
          knn: {
            field: "vector",
            query_vector: [10, 22, 77],
            k: 10,
            num_candidates: 10,
          },
        },
      ],
    },
  },
});
console.log(response);
GET movies/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "sparse_vector": {
                "field": "plot_embedding",
                "inference_id": "my-elser-model",
                "query": "films that explore psychological depths"
              }
            }
          }
        },
        {
          "standard": {
            "query": {
              "multi_match": {
                "query": "crime",
                "fields": [
                  "plot",
                  "title"
                ]
              }
            }
          }
        },
        {
          "knn": {
            "field": "vector",
            "query_vector": [10, 22, 77],
            "k": 10,
            "num_candidates": 10
          }
        }
      ]
    }
  }
}

Text Similarity Re-ranker Retriever

The text_similarity_reranker retriever uses an NLP model to improve search results by reordering the top-k documents based on their semantic similarity to the query.

Refer to Semantic re-ranking for a high-level overview of semantic re-ranking.

Prerequisites

To use text_similarity_reranker you must first set up an inference endpoint for the rerank task using the Create inference API. The endpoint should be set up with a machine learning model that can compute text similarity. Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by Elasticsearch.

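You have the following options, each covered by an example below:

  • Use the Elastic Rerank model through the elasticsearch inference service.
  • Use a Cohere Rerank inference endpoint that is set up for the rerank task type.
  • Upload a Hugging Face text similarity model to Elasticsearch using Eland, then create an inference endpoint for the rerank task.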

Parameters

retriever

(Required, retriever)

The child retriever that generates the initial set of top documents to be re-ranked.

field

(Required, string)

The document field to be used for text similarity comparisons. This field should contain the text that will be evaluated against the inference_text.

inference_id

(Required, string)

Unique identifier of the inference endpoint created using the inference API.

inference_text

(Required, string)

The text snippet used as the basis for similarity comparison.

rank_window_size

(Optional, int)

The number of top documents to consider in the re-ranking process. Defaults to 10.

min_score

(Optional, float)

Sets a minimum threshold score for including documents in the re-ranked results. Documents with similarity scores below this threshold will be excluded. Note that score calculations vary depending on the model used.

filter

(Optional, query object or list of query objects)

Applies the specified boolean query filter to the child retriever. If the child retriever already specifies any filters, then this top-level filter is applied in conjunction with the filter defined in the child retriever.
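
The way rank_window_size and min_score shape the re-ranked results can be sketched as follows. The scoring function here is a toy word-overlap measure that stands in for a real rerank model; it is a hypothetical placeholder, as are the function names and sample documents, and the sketch does not call any inference endpoint.

```python
def word_overlap(inference_text, document_text):
    """Toy stand-in for a model score: share of query words found in the text."""
    query_words = set(inference_text.lower().split())
    doc_words = set(document_text.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def rerank(hits, inference_text, field, score_fn,
           rank_window_size=10, min_score=None):
    """Re-score the top rank_window_size hits and reorder them."""
    window = hits[:rank_window_size]
    scored = [(score_fn(inference_text, hit[field]), hit) for hit in window]
    if min_score is not None:
        # Drop documents whose similarity score is below the threshold.
        scored = [(score, hit) for score, hit in scored if score >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [dict(hit, _score=score) for score, hit in scored]


hits = [
    {"_id": "1", "text": "The moon orbits the earth"},
    {"_id": "2", "text": "How often does the moon hide the sun"},
    {"_id": "3", "text": "Cooking pasta at home"},
]
reranked = rerank(hits, "moon hide sun", "text", word_overlap, min_score=0.3)
print([hit["_id"] for hit in reranked])  # ['2', '1']
```

Documents outside the window are never scored, so a larger rank_window_size gives the model more candidates at the cost of more inference work.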

Example: Elastic Rerank

This example demonstrates how to deploy the Elastic Rerank model and use it to re-rank search results using the text_similarity_reranker retriever.

Follow these steps:

  1. Create an inference endpoint for the rerank task using the Create inference API.

    const response = await client.inference.put({
      task_type: "rerank",
      inference_id: "my-elastic-rerank",
      inference_config: {
        service: "elasticsearch",
        service_settings: {
          model_id: ".rerank-v1",
          num_threads: 1,
          adaptive_allocations: {
            enabled: true,
            min_number_of_allocations: 1,
            max_number_of_allocations: 10,
          },
        },
      },
    });
    console.log(response);
    PUT _inference/rerank/my-elastic-rerank
    {
      "service": "elasticsearch",
      "service_settings": {
        "model_id": ".rerank-v1",
        "num_threads": 1,
        "adaptive_allocations": { 
          "enabled": true,
          "min_number_of_allocations": 1,
          "max_number_of_allocations": 10
        }
      }
    }

    Adaptive allocations will be enabled with a minimum of 1 and a maximum of 10 allocations.

  2. Define a text_similarity_reranker retriever:

    const response = await client.search({
      retriever: {
        text_similarity_reranker: {
          retriever: {
            standard: {
              query: {
                match: {
                  text: "How often does the moon hide the sun?",
                },
              },
            },
          },
          field: "text",
          inference_id: "my-elastic-rerank",
          inference_text: "How often does the moon hide the sun?",
          rank_window_size: 100,
          min_score: 0.5,
        },
      },
    });
    console.log(response);
    POST _search
    {
      "retriever": {
        "text_similarity_reranker": {
          "retriever": {
            "standard": {
              "query": {
                "match": {
                  "text": "How often does the moon hide the sun?"
                }
              }
            }
          },
          "field": "text",
          "inference_id": "my-elastic-rerank",
          "inference_text": "How often does the moon hide the sun?",
          "rank_window_size": 100,
          "min_score": 0.5
        }
      }
    }
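
The two steps above are shown only for the JavaScript client and the console. A Python equivalent might look like the sketch below, which mirrors the client.inference.put and client.search calls used in the other Python examples on this page. The wrapper function names are hypothetical, and the client is passed in so that the snippet makes no assumption about how you connect to your cluster.

```python
def create_elastic_rerank_endpoint(client):
    """Step 1: create the rerank inference endpoint for Elastic Rerank."""
    return client.inference.put(
        task_type="rerank",
        inference_id="my-elastic-rerank",
        inference_config={
            "service": "elasticsearch",
            "service_settings": {
                "model_id": ".rerank-v1",
                "num_threads": 1,
                "adaptive_allocations": {
                    "enabled": True,
                    "min_number_of_allocations": 1,
                    "max_number_of_allocations": 10,
                },
            },
        },
    )


def search_with_elastic_rerank(client, text):
    """Step 2: run a match query and re-rank its top hits against `text`."""
    return client.search(
        retriever={
            "text_similarity_reranker": {
                "retriever": {
                    "standard": {"query": {"match": {"text": text}}}
                },
                "field": "text",
                "inference_id": "my-elastic-rerank",
                "inference_text": text,
                "rank_window_size": 100,
                "min_score": 0.5,
            }
        },
    )


# Usage, assuming `client` is an already configured Elasticsearch client:
# create_elastic_rerank_endpoint(client)
# resp = search_with_elastic_rerank(client, "How often does the moon hide the sun?")
# print(resp)
```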

Example: Cohere Rerank

This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents. This requires a Cohere Rerank inference endpoint that is set up for the rerank task type.

resp = client.search(
    index="index",
    retriever={
        "text_similarity_reranker": {
            "retriever": {
                "standard": {
                    "query": {
                        "match_phrase": {
                            "text": "landmark in Paris"
                        }
                    }
                }
            },
            "field": "text",
            "inference_id": "my-cohere-rerank-model",
            "inference_text": "Most famous landmark in Paris",
            "rank_window_size": 100,
            "min_score": 0.5
        }
    },
)
print(resp)
const response = await client.search({
  index: "index",
  retriever: {
    text_similarity_reranker: {
      retriever: {
        standard: {
          query: {
            match_phrase: {
              text: "landmark in Paris",
            },
          },
        },
      },
      field: "text",
      inference_id: "my-cohere-rerank-model",
      inference_text: "Most famous landmark in Paris",
      rank_window_size: 100,
      min_score: 0.5,
    },
  },
});
console.log(response);
GET /index/_search
{
   "retriever": {
      "text_similarity_reranker": {
         "retriever": {
            "standard": {
               "query": {
                  "match_phrase": {
                     "text": "landmark in Paris"
                  }
               }
            }
         },
         "field": "text",
         "inference_id": "my-cohere-rerank-model",
         "inference_text": "Most famous landmark in Paris",
         "rank_window_size": 100,
         "min_score": 0.5
      }
   }
}

Example: Semantic re-ranking with a Hugging Face model

The following example uses the cross-encoder/ms-marco-MiniLM-L-6-v2 model from Hugging Face to rerank search results based on semantic similarity. The model must be uploaded to Elasticsearch using Eland.

Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by Elasticsearch.

Follow these steps to load the model and create a semantic re-ranker.

  1. Install Eland using pip

    python -m pip install 'eland[pytorch]'
  2. Upload the model to Elasticsearch using Eland. This example assumes you have an Elastic Cloud deployment and an API key. Refer to the Eland documentation for more authentication options.

    eland_import_hub_model \
      --cloud-id $CLOUD_ID \
      --es-api-key $ES_API_KEY \
      --hub-model-id cross-encoder/ms-marco-MiniLM-L-6-v2 \
      --task-type text_similarity \
      --clear-previous \
      --start
  3. Create an inference endpoint for the rerank task

    resp = client.inference.put(
        task_type="rerank",
        inference_id="my-msmarco-minilm-model",
        inference_config={
            "service": "elasticsearch",
            "service_settings": {
                "num_allocations": 1,
                "num_threads": 1,
                "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
            }
        },
    )
    print(resp)
    const response = await client.inference.put({
      task_type: "rerank",
      inference_id: "my-msmarco-minilm-model",
      inference_config: {
        service: "elasticsearch",
        service_settings: {
          num_allocations: 1,
          num_threads: 1,
          model_id: "cross-encoder__ms-marco-minilm-l-6-v2",
        },
      },
    });
    console.log(response);
    PUT _inference/rerank/my-msmarco-minilm-model
    {
      "service": "elasticsearch",
      "service_settings": {
        "num_allocations": 1,
        "num_threads": 1,
        "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
      }
    }
  4. Define a text_similarity_reranker retriever.

    resp = client.search(
        index="movies",
        retriever={
            "text_similarity_reranker": {
                "retriever": {
                    "standard": {
                        "query": {
                            "match": {
                                "genre": "drama"
                            }
                        }
                    }
                },
                "field": "plot",
                "inference_id": "my-msmarco-minilm-model",
                "inference_text": "films that explore psychological depths"
            }
        },
    )
    print(resp)
    const response = await client.search({
      index: "movies",
      retriever: {
        text_similarity_reranker: {
          retriever: {
            standard: {
              query: {
                match: {
                  genre: "drama",
                },
              },
            },
          },
          field: "plot",
          inference_id: "my-msmarco-minilm-model",
          inference_text: "films that explore psychological depths",
        },
      },
    });
    console.log(response);
    POST movies/_search
    {
      "retriever": {
        "text_similarity_reranker": {
          "retriever": {
            "standard": {
              "query": {
                "match": {
                  "genre": "drama"
                }
              }
            }
          },
          "field": "plot",
          "inference_id": "my-msmarco-minilm-model",
          "inference_text": "films that explore psychological depths"
        }
      }
    }

    This retriever uses a standard match query to search the movies index for films tagged with the genre "drama". It then re-ranks the results based on semantic similarity to the text in the inference_text parameter, using the model uploaded to Elasticsearch in the earlier steps.

Query Rules Retriever

The rule retriever enables fine-grained control over search results by applying contextual query rules to pin or exclude documents for specific queries. This retriever has similar functionality to the rule query, but works out of the box with other retrievers.

Prerequisites

To use the rule retriever you must first create one or more query rulesets using the query rules management APIs.

Parameters
retriever

(Required, retriever)

The child retriever that returns the results to apply query rules on top of. This can be a standalone retriever such as the standard or knn retriever, or it can be a compound retriever.

ruleset_ids

(Required, array)

An array of one or more unique query ruleset IDs with query-based rules to match and apply as applicable. Rulesets and their associated rules are evaluated in the order in which they are specified in the query and ruleset. The maximum number of rulesets to specify is 10.

match_criteria

(Required, object)

Defines the match criteria to apply to rules in the given query ruleset(s). Match criteria should match the keys defined in the criteria.metadata field of the rule.

rank_window_size

(Optional, int)

The number of top documents to return from the rule retriever. Defaults to 10.
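
As a simplified mental model of what pinning and excluding do to the child retriever's results, consider the sketch below. It assumes the matching rules have already been resolved into lists of pinned and excluded document IDs; evaluating match_criteria against a ruleset is not modeled, and the function apply_query_rules is hypothetical.

```python
def apply_query_rules(hits, pinned_ids=(), excluded_ids=(), rank_window_size=10):
    """Place pinned documents first and drop excluded ones."""
    excluded = set(excluded_ids)
    by_id = {hit["_id"]: hit for hit in hits}

    # Pinned documents lead the results, in the order the rules list them.
    pinned = [
        by_id.get(doc_id, {"_id": doc_id})
        for doc_id in pinned_ids
        if doc_id not in excluded
    ]
    pinned_set = {hit["_id"] for hit in pinned}

    # Remaining hits keep their original order, minus excluded and pinned docs.
    organic = [
        hit for hit in hits
        if hit["_id"] not in excluded and hit["_id"] not in pinned_set
    ]
    return (pinned + organic)[:rank_window_size]


hits = [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}, {"_id": "d"}]
result = apply_query_rules(hits, pinned_ids=["c"], excluded_ids=["b"])
print([hit["_id"] for hit in result])  # ['c', 'a', 'd']
```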

Example: Rule retriever

This example shows the rule retriever executed without any additional retrievers. It runs the query defined by the retriever and applies the rules from my-ruleset on top of the returned results.

resp = client.search(
    index="movies",
    retriever={
        "rule": {
            "match_criteria": {
                "query_string": "harry potter"
            },
            "ruleset_ids": [
                "my-ruleset"
            ],
            "retriever": {
                "standard": {
                    "query": {
                        "query_string": {
                            "query": "harry potter"
                        }
                    }
                }
            }
        }
    },
)
print(resp)
const response = await client.search({
  index: "movies",
  retriever: {
    rule: {
      match_criteria: {
        query_string: "harry potter",
      },
      ruleset_ids: ["my-ruleset"],
      retriever: {
        standard: {
          query: {
            query_string: {
              query: "harry potter",
            },
          },
        },
      },
    },
  },
});
console.log(response);
GET movies/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "harry potter"
      },
      "ruleset_ids": [
        "my-ruleset"
      ],
      "retriever": {
        "standard": {
          "query": {
            "query_string": {
              "query": "harry potter"
            }
          }
        }
      }
    }
  }
}

Example: Rule retriever combined with RRF

This example shows how to combine the rule retriever with other rerank retrievers such as rrf or text_similarity_reranker.

The rule retriever will apply rules to any documents returned from its defined retriever or any of its sub-retrievers. This means that for the best results, the rule retriever should be the outermost defined retriever. Nesting a rule retriever as a sub-retriever under a reranker such as rrf or text_similarity_reranker may not produce the expected results.

resp = client.search(
    index="movies",
    retriever={
        "rule": {
            "match_criteria": {
                "query_string": "harry potter"
            },
            "ruleset_ids": [
                "my-ruleset"
            ],
            "retriever": {
                "rrf": {
                    "retrievers": [
                        {
                            "standard": {
                                "query": {
                                    "query_string": {
                                        "query": "sorcerer's stone"
                                    }
                                }
                            }
                        },
                        {
                            "standard": {
                                "query": {
                                    "query_string": {
                                        "query": "chamber of secrets"
                                    }
                                }
                            }
                        }
                    ]
                }
            }
        }
    },
)
print(resp)
const response = await client.search({
  index: "movies",
  retriever: {
    rule: {
      match_criteria: {
        query_string: "harry potter",
      },
      ruleset_ids: ["my-ruleset"],
      retriever: {
        rrf: {
          retrievers: [
            {
              standard: {
                query: {
                  query_string: {
                    query: "sorcerer's stone",
                  },
                },
              },
            },
            {
              standard: {
                query: {
                  query_string: {
                    query: "chamber of secrets",
                  },
                },
              },
            },
          ],
        },
      },
    },
  },
});
console.log(response);
GET movies/_search
{
  "retriever": {
    "rule": { 
      "match_criteria": {
        "query_string": "harry potter"
      },
      "ruleset_ids": [
        "my-ruleset"
      ],
      "retriever": {
        "rrf": { 
          "retrievers": [
            {
              "standard": {
                "query": {
                  "query_string": {
                    "query": "sorcerer's stone"
                  }
                }
              }
            },
            {
              "standard": {
                "query": {
                  "query_string": {
                    "query": "chamber of secrets"
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}

The rule retriever is the outermost retriever, applying rules to the search results that were previously reranked using the rrf retriever.

The rrf retriever returns results from all of its sub-retrievers, and the output of the rrf retriever is used as input to the rule retriever.

Common usage guidelines

Using from and size with a retriever tree

The from and size parameters are provided globally as part of the general search API. They are applied to all retrievers in a retriever tree, unless a specific retriever overrides the size parameter using a different parameter such as rank_window_size. However, the final search hits are always limited to size.
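
One way to picture this is as slicing a ranked list, as in the sketch below. It assumes that a retriever with rank_window_size produces at most that many ranked documents and that from and size then select a page from that list; this is an interpretation of the paragraph above rather than a description of the actual implementation, and the function page_of_hits is hypothetical.

```python
def page_of_hits(ranked_ids, from_=0, size=10, rank_window_size=None):
    """Return the page of hits selected by from/size within the rank window."""
    # A retriever with rank_window_size ranks at most that many documents.
    if rank_window_size is not None:
        ranked_ids = ranked_ids[:rank_window_size]
    # The final hits are always limited to `size`.
    return ranked_ids[from_:from_ + size]


ranked = list(range(1, 101))  # document IDs 1..100, best first

print(page_of_hits(ranked, from_=10, size=10, rank_window_size=50))
# [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
print(page_of_hits(ranked, from_=45, size=10, rank_window_size=50))
# [46, 47, 48, 49, 50]
```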

Using aggregations with a retriever tree

Aggregations are globally specified as part of a search request. The query used for an aggregation is the combination of all leaf retrievers as should clauses in a boolean query.
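
The sketch below illustrates the idea by walking a retriever tree, collecting the queries of its leaf retrievers, and combining them as should clauses. It handles only standard leaf retrievers, since this page does not describe how other leaf types are represented in the combined query. The helper names are hypothetical, and the child-retriever keys (retrievers and retriever) are taken from the examples on this page.

```python
def leaf_retrievers(retriever):
    """Return the leaf retrievers of a single-key retriever object."""
    body = next(iter(retriever.values()))
    children = list(body.get("retrievers", []))
    if "retriever" in body:
        children.append(body["retriever"])
    if not children:
        return [retriever]
    leaves = []
    for child in children:
        leaves.extend(leaf_retrievers(child))
    return leaves


def aggregation_query(retriever):
    """Combine the queries of all standard leaf retrievers as should clauses."""
    should = []
    for leaf in leaf_retrievers(retriever):
        name, body = next(iter(leaf.items()))
        if name != "standard":
            raise NotImplementedError(f"leaf type {name!r} is not modeled here")
        should.append(body.get("query", {"match_all": {}}))
    return {"bool": {"should": should}}


tree = {
    "rule": {
        "match_criteria": {"query_string": "harry potter"},
        "ruleset_ids": ["my-ruleset"],
        "retriever": {
            "rrf": {
                "retrievers": [
                    {"standard": {"query": {"query_string": {"query": "sorcerer's stone"}}}},
                    {"standard": {"query": {"query_string": {"query": "chamber of secrets"}}}},
                ]
            }
        },
    }
}
print(aggregation_query(tree))
```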

Restrictions on search parameters when specifying a retriever

When a retriever is specified as part of a search, the following elements are not allowed at the top-level. Instead they are only allowed as elements of specific retrievers: