Semantic query

This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.

The semantic query type enables you to perform semantic search on data stored in a semantic_text field.

Example request

GET my-index-000001/_search
{
  "query": {
    "semantic": {
      "field": "inference_field",
      "query": "Best surfing places"
    }
  }
}

Top-level parameters for semantic

field
(Required, string) The semantic_text field to perform the query on.
query
(Required, string) The query text to be searched for on the field.
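
As a minimal sketch of how the two required parameters fit together, the Python helper below assembles the same request body as the example request. The function name `semantic_query` is hypothetical and not part of any Elasticsearch client; the resulting dictionary would be passed as the search body by whichever client you use.

```python
import json


def semantic_query(field: str, query: str) -> dict:
    """Build a search body containing a single semantic query.

    Hypothetical helper: both parameters are required by the semantic
    query, so empty values are rejected before the request is sent.
    """
    if not field or not query:
        raise ValueError("both 'field' and 'query' are required")
    return {"query": {"semantic": {"field": field, "query": query}}}


body = semantic_query("inference_field", "Best surfing places")
print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice made for this sketch only; the server performs its own validation of the request.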

Refer to this tutorial to learn more about semantic search using semantic_text and semantic query.

Hybrid search with the semantic query

The semantic query can be used as part of a hybrid search, where it is combined with lexical queries. For example, the query below finds documents whose title field matches "mountain lake" and combines them with results from a semantic search on the field title_semantic, which is a semantic_text field. The combined documents are then scored, and the 3 top-scoring documents are returned.

POST my-index/_search
{
  "size" : 3,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": {
              "query": "mountain lake",
              "boost": 1
            }
          }
        },
        {
          "semantic": {
            "field": "title_semantic",
            "query": "mountain lake",
            "boost": 2
          }
        }
      ]
    }
  }
}
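
The hybrid request above can also be generated programmatically when the search text and boosts vary per request. The following is a sketch under stated assumptions: `hybrid_query` and its parameter names are hypothetical, and the default boosts simply mirror the values in the example.

```python
def hybrid_query(text, lexical_field, semantic_field,
                 lexical_boost=1, semantic_boost=2, size=3):
    """Combine a lexical match query and a semantic query in a bool/should.

    Hypothetical helper: the same search text is sent to both clauses,
    and each clause carries its own boost.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    {"match": {lexical_field: {"query": text,
                                               "boost": lexical_boost}}},
                    {"semantic": {"field": semantic_field,
                                  "query": text,
                                  "boost": semantic_boost}},
                ]
            }
        },
    }


hybrid_body = hybrid_query("mountain lake", "title", "title_semantic")
```

Raising `semantic_boost` relative to `lexical_boost` weights the semantic clause more heavily in the combined score, as in the example where the semantic clause has twice the boost of the match clause.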

You can also use semantic_text as part of Reciprocal Rank Fusion to make ranking relevant results easier:

GET my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "term": {
                "text": "shoes"
              }
            }
          }
        },
        {
          "standard": {
            "query": {
              "semantic": {
                "field": "semantic_field",
                "query": "shoes"
              }
            }
          }
        }
      ],
      "rank_window_size": 50,
      "rank_constant": 20
    }
  }
}
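
To illustrate what rank_constant does, the sketch below applies the standard Reciprocal Rank Fusion formula, in which each document receives 1 / (rank_constant + rank) from every result list it appears in, with ranks starting at 1. This is an illustration only, not Elasticsearch's implementation; the document IDs and both rankings are invented for the example.

```python
from collections import defaultdict


def rrf_scores(rankings, rank_constant=20):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (rank_constant + rank) per document,
    where rank is the 1-based position of the document in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented rankings standing in for the two retrievers' results.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_b", "doc_d", "doc_a"]

fused = rrf_scores([lexical, semantic], rank_constant=20)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that appear in both lists accumulate two contributions, so they tend to rise above documents returned by only one retriever. A larger rank_constant flattens the difference between adjacent ranks.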

Advanced search on semantic_text fields

The semantic query uses default settings for searching on semantic_text fields for ease of use. To fine-tune a search on a semantic_text field, you need to know the task type of the inference endpoint referenced by the inference_id configured for that field. Use the Get inference API to look up the task_type associated with the endpoint. Depending on the task_type, use either the sparse_vector or the knn query for greater flexibility and customization.
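
The choice between the two query types can be expressed as a simple lookup. The sketch below assumes you have already obtained the task_type string for the endpoint; the mapping and the function name `query_type_for` are hypothetical and cover only the two task types discussed in this section.

```python
# Task types covered in this section and the query type each one calls for.
QUERY_TYPE_BY_TASK = {
    "sparse_embedding": "sparse_vector",
    "text_embedding": "knn",
}


def query_type_for(task_type: str) -> str:
    """Return the query type to use for a given inference task type."""
    try:
        return QUERY_TYPE_BY_TASK[task_type]
    except KeyError:
        raise ValueError(
            f"no advanced query mapping for task type {task_type!r}"
        ) from None


print(query_type_for("sparse_embedding"))
print(query_type_for("text_embedding"))
```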

Search with sparse_embedding inference

When the inference endpoint uses a sparse_embedding model, you can use a sparse_vector query on a semantic_text field in the following way:

GET test-index/_search
{
  "query": {
    "nested": {
      "path": "inference_field.inference.chunks",
      "query": {
        "sparse_vector": {
          "field": "inference_field.inference.chunks.embeddings",
          "inference_id": "my-inference-id",
          "query": "mountain lake"
        }
      }
    }
  }
}

You can customize the sparse_vector query to include specific settings, like pruning configuration.
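
As a hedged sketch of such a customization, the helper below builds the nested sparse_vector request from above and optionally adds a pruning configuration. The helper name is hypothetical. The `prune` and `pruning_config` parameter names, and the keys inside the configuration, follow the sparse_vector query documentation as understood here; verify them and suitable values against the version you run.

```python
def sparse_vector_chunks_query(field, inference_id, text, pruning_config=None):
    """Build a nested sparse_vector query over a semantic_text field's chunks.

    Hypothetical helper. When pruning_config is given, pruning is enabled
    and the configuration is passed through unchanged.
    """
    sparse = {
        "field": f"{field}.inference.chunks.embeddings",
        "inference_id": inference_id,
        "query": text,
    }
    if pruning_config is not None:
        sparse["prune"] = True
        sparse["pruning_config"] = pruning_config
    return {
        "query": {
            "nested": {
                "path": f"{field}.inference.chunks",
                "query": {"sparse_vector": sparse},
            }
        }
    }


# Illustrative values only; tune them for your own data.
pruned_body = sparse_vector_chunks_query(
    "inference_field",
    "my-inference-id",
    "mountain lake",
    pruning_config={
        "tokens_freq_ratio_threshold": 5,
        "tokens_weight_threshold": 0.4,
        "only_score_pruned_tokens": False,
    },
)
```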

Search with text_embedding inference

When the inference endpoint uses a text_embedding model, you can use a knn query on a semantic_text field in the following way:

GET test-index/_search
{
  "query": {
    "nested": {
      "path": "inference_field.inference.chunks",
      "query": {
        "knn": {
          "field": "inference_field.inference.chunks.embeddings",
          "query_vector_builder": {
            "text_embedding": {
              "model_id": "my_inference_id",
              "model_text": "mountain lake"
            }
          }
        }
      }
    }
  }
}

You can customize the knn query to include specific settings, like num_candidates and k.
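
A minimal sketch of that customization follows. The helper name `knn_chunks_query` is hypothetical, the values of k and num_candidates are illustrative, and the check that num_candidates is not smaller than k is an assumption made for this sketch; confirm the exact constraints in the knn query documentation for your version.

```python
def knn_chunks_query(field, inference_id, text, k=None, num_candidates=None):
    """Build a nested knn query over a semantic_text field's chunks.

    Hypothetical helper. k and num_candidates are only included when set,
    so the server defaults apply otherwise.
    """
    # Assumed sanity check: candidates per shard should cover at least k.
    if k is not None and num_candidates is not None and num_candidates < k:
        raise ValueError("num_candidates must not be less than k")
    knn = {
        "field": f"{field}.inference.chunks.embeddings",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": inference_id,
                "model_text": text,
            }
        },
    }
    if k is not None:
        knn["k"] = k
    if num_candidates is not None:
        knn["num_candidates"] = num_candidates
    return {
        "query": {
            "nested": {
                "path": f"{field}.inference.chunks",
                "query": {"knn": knn},
            }
        }
    }


knn_body = knn_chunks_query(
    "inference_field", "my_inference_id", "mountain lake",
    k=10, num_candidates=100,
)
```

A higher num_candidates generally improves recall at the cost of slower searches, so it is usually set to a multiple of k.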