Search and compare text

The Elastic Stack machine learning features can generate embeddings, which you can use to search unstructured text or to compare different pieces of text.

Text embedding

Text embedding is a task that produces a mathematical representation of text called an embedding. The machine learning model turns the text into an array of numerical values (also known as a vector). Pieces of content with similar meaning have similar representations, so a mathematical similarity function can determine whether two pieces of text are semantically similar, different, or even opposite.
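
As an illustration of a similarity function, the following minimal sketch computes cosine similarity in plain Python. It is not part of the Elastic Stack API; the function name and the tiny three-dimensional vectors are hypothetical, chosen only to make the arithmetic easy to follow. Real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
base = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]      # scaled copy of `base`
opposite = [-1.0, -2.0, -3.0]         # points the other way
orthogonal = [3.0, 0.0, -1.0]         # dot product with `base` is zero

score_similar = cosine_similarity(base, same_direction)
score_opposite = cosine_similarity(base, opposite)
score_unrelated = cosine_similarity(base, orthogonal)
```

Scores close to 1 indicate similar direction, scores close to 0 indicate unrelated vectors, and scores close to -1 indicate opposite direction.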

This task only produces the embedding. Once created, the embedding can be stored in a dense_vector field and used at search time. For example, you can use these vectors in a k-nearest neighbor (kNN) search to achieve semantic search capabilities.
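
To make the idea of a kNN search over stored embeddings concrete, here is a rough sketch of an exact, brute-force version in plain Python. It does not call Elasticsearch and is not how the product implements kNN search; the document IDs, vectors, and the `knn_search` helper are all hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query_vector, stored, k):
    """Return the k (doc_id, score) pairs most similar to query_vector.

    `stored` maps a document ID to its embedding. Every stored vector is
    scored against the query, so the cost grows linearly with the number
    of documents.
    """
    scored = (
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in stored.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical stored embeddings (3 dimensions for readability).
stored_embeddings = {
    "doc-fox": [0.9, 0.1, 0.0],
    "doc-dog": [0.8, 0.3, 0.1],
    "doc-tax": [0.0, 0.2, 0.9],
}

query_embedding = [1.0, 0.2, 0.0]
top_hits = knn_search(query_embedding, stored_embeddings, k=2)
top_ids = [doc_id for doc_id, _ in top_hits]
```

In this toy data set the two vectors that point in roughly the same direction as the query rank first, and the unrelated one is left out.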

The following is an example of producing a text embedding:

{
    "docs": [{"text_field": "The quick brown fox jumps over the lazy dog."}]
}
...

The task returns the following result:

...
{
    "predicted_value": [0.293478, -0.23845, ..., 1.34589e2, 0.119376]
    ...
}
...