Semantic search
Semantic search is a search method that helps you find data based on the intent and contextual meaning of a search query, instead of matching on query terms (lexical search).
Elasticsearch provides various semantic search capabilities using natural language processing (NLP) and vector search. Using an NLP model enables you to extract text embeddings from text. Embeddings are vectors that provide a numeric representation of a piece of text. Pieces of content with similar meaning have similar representations.
You have several options for using NLP models in the Elastic Stack:

- use the `semantic_text` workflow (recommended)
- use the inference API workflow
- deploy models directly in Elasticsearch

Refer to Choose a semantic search workflow below to pick between them.
You can also store your own embeddings in Elasticsearch as vectors. Refer to Using the right query for guidance on which query type to use for semantic search.
At query time, Elasticsearch can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings.
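For example, if you bring your own dense vector embeddings, you can map a `dense_vector` field to hold them. This is a minimal sketch: the index and field names are placeholders, and `dims` must match the output size of the model that produced your vectors.

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "my_text": { "type": "text" },
      "my_embedding": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}
```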
Choose a semantic search workflow

semantic_text workflow

The simplest way to use NLP models in the Elastic Stack is through the `semantic_text` workflow.
We recommend using this approach because it abstracts away a lot of manual work.
All you need to do is create an inference endpoint and an index mapping to start ingesting, embedding, and querying data.
There is no need to define model-related settings and parameters, or to create inference ingest pipelines.
Refer to the Create an inference endpoint API documentation for a list of supported services.
The Semantic search with `semantic_text` tutorial shows you the process end-to-end.
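For illustration, the whole setup reduces to two requests. This is a minimal sketch using the ELSER service; the endpoint ID, index name, and field name are placeholders, and other services are available (see the Create an inference endpoint API documentation).

```console
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
```

Documents indexed into the `content` field are chunked and embedded automatically; no ingest pipeline is needed.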
inference API workflow

The inference API workflow is more complex but offers greater control over the inference endpoint configuration. You need to create an inference endpoint, provide various model-related settings and parameters, define an index mapping, and set up an inference ingest pipeline with the appropriate settings.
The Semantic search with the inference API tutorial shows you the process end-to-end.
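To sketch the extra moving parts, assuming the same hypothetical `my-elser-endpoint` from above: you define the embedding field yourself and route documents through an inference ingest pipeline. All names here are illustrative.

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": { "type": "sparse_vector" }
    }
  }
}

PUT _ingest/pipeline/my-inference-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "my-elser-endpoint",
        "input_output": [
          {
            "input_field": "content",
            "output_field": "content_embedding"
          }
        ]
      }
    }
  ]
}
```

Index documents with the `pipeline=my-inference-pipeline` query parameter (or set `index.default_pipeline`) so embeddings are generated at ingest time.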
Model deployment workflow

You can also deploy NLP models in Elasticsearch manually, without using an inference endpoint. This is the most complex and labor-intensive workflow for performing semantic search in the Elastic Stack. You need to select an NLP model from the list of supported dense and sparse vector models, deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data.
The Semantic search with a model deployed in Elasticsearch tutorial shows you the process end-to-end.
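For orientation: once a model has been imported with the Eland client (Eland normalizes Hugging Face model IDs to lowercase with `__` replacing `/`), you start its deployment through the ML APIs and reference it from an ingest pipeline, much like the inference API workflow. The model ID below is an illustrative example.

```console
POST _ml/trained_models/sentence-transformers__msmarco-minilm-l-12-v3/deployment/_start
```

The mapping and ingest pipeline steps then mirror the inference API workflow, with the inference processor's `model_id` set to the deployed model's ID.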
Using the right query

Crafting the right query is crucial for semantic search. Which query you use and which field you target in your queries depends on your chosen workflow. If you're using the `semantic_text` workflow, it's quite simple. If not, it depends on which type of embeddings you're working with.
| Field type to query | Query to use | Notes |
|---|---|---|
| `semantic_text` | `semantic` | The `semantic_text` field handles generating embeddings for you at index time and query time. |
| `sparse_vector` | `sparse_vector` | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. |
| `dense_vector` | `knn` | The `knn` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. |
If you want Elasticsearch to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query.
If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query, depending on the NLP model you used to generate the embeddings.
For the easiest way to perform semantic search in the Elastic Stack, refer to the `semantic_text` end-to-end tutorial.
Read more

- Tutorials:
  - Semantic search with ELSER using the model deployment workflow
  - Semantic search with `semantic_text`
- Interactive examples:
  - The `elasticsearch-labs` repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the Elasticsearch Python client
- Blogs: