Get an inference endpoint

Added in 8.11.0

GET /_inference/{task_type}/{inference_id}

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference ID
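The two path parameters combine into a single request path. The following is a minimal sketch of building that path, assuming the caller wants to reject unknown task types before sending the request; the helper name `inference_endpoint_path` and the ID `my-e5-model` are hypothetical and are not part of the API.

```python
from urllib.parse import quote

# Task types listed in the path-parameter description.
TASK_TYPES = {
    "sparse_embedding",
    "text_embedding",
    "rerank",
    "completion",
    "chat_completion",
}


def inference_endpoint_path(task_type: str, inference_id: str) -> str:
    """Return the request path for GET /_inference/{task_type}/{inference_id}."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    # Percent-encode the ID so characters such as '/' cannot alter the path.
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"


# 'my-e5-model' is an illustrative ID, not one defined by this reference.
print(inference_endpoint_path("text_embedding", "my-e5-model"))
```

Checking the task type on the client side is a design choice in this sketch; the server remains the authority on which values it accepts.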

Responses

  • 200 application/json
    • endpoints array[object] Required
      • chunking_settings object
        • max_chunk_size number

          The maximum size of a chunk in words. This value must be between 20 and 300 for the sentence strategy, or between 10 and 300 for the word strategy.

        • overlap number

          The number of overlapping words for chunks. It applies only to the word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        • sentence_overlap number

          The number of overlapping sentences for chunks. It applies only to the sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          The chunking strategy: sentence or word.

      • service string Required

        The service type

      • service_settings object Required

      • task_settings object

      • inference_id string Required

        The inference ID

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
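The limits stated for the chunking attributes (max_chunk_size, overlap, sentence_overlap, strategy) can be expressed as a small client-side check. This is a sketch under stated assumptions, not the server's validation logic: the function name `check_chunking_settings` is hypothetical, the sketch assumes a `strategy` value is present, and it only checks an attribute when that attribute appears in the input.

```python
# Lower bounds for max_chunk_size per strategy, as stated in the attribute list.
MIN_CHUNK_SIZE = {"sentence": 20, "word": 10}
MAX_CHUNK_SIZE = 300


def check_chunking_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    strategy = settings.get("strategy")
    if strategy not in MIN_CHUNK_SIZE:
        return [f"strategy must be 'sentence' or 'word', got {strategy!r}"]

    problems = []
    size = settings.get("max_chunk_size")
    lower = MIN_CHUNK_SIZE[strategy]
    if size is not None and not lower <= size <= MAX_CHUNK_SIZE:
        problems.append(
            f"max_chunk_size must be between {lower} and {MAX_CHUNK_SIZE} "
            f"for the {strategy} strategy"
        )

    # overlap applies only to the word strategy.
    overlap = settings.get("overlap")
    if strategy == "word" and overlap is not None and size is not None:
        if overlap > size / 2:
            problems.append("overlap cannot be higher than half of max_chunk_size")

    # sentence_overlap applies only to the sentence strategy.
    sentence_overlap = settings.get("sentence_overlap")
    if strategy == "sentence" and sentence_overlap is not None:
        if sentence_overlap not in (0, 1):
            problems.append("sentence_overlap must be 0 or 1")

    return problems


# Illustrative input: an overlap of 60 exceeds half of a 100-word chunk.
print(check_chunking_settings({"strategy": "word", "max_chunk_size": 100, "overlap": 60}))
```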

GET /_inference/{task_type}/{inference_id}
curl \
 --request GET 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "endpoints": [
    {
      "chunking_settings": {
        "max_chunk_size": 42.0,
        "overlap": 42.0,
        "sentence_overlap": 42.0,
        "strategy": "string"
      },
      "service": "string",
      "service_settings": {},
      "task_settings": {},
      "inference_id": "string",
      "task_type": "sparse_embedding"
    }
  ]
}
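A response of this shape can be reduced to the required fields of each endpoint. The sketch below parses a sample body with the standard json module; the function name `summarize_endpoints` and the sample values (`my-endpoint`, `elasticsearch`) are illustrative assumptions and do not come from a real cluster.

```python
import json

# Sample body following the 200 response shape; the values are made up.
SAMPLE_BODY = """
{
  "endpoints": [
    {
      "service": "elasticsearch",
      "service_settings": {},
      "inference_id": "my-endpoint",
      "task_type": "sparse_embedding"
    }
  ]
}
"""


def summarize_endpoints(body: str) -> list:
    """Return (inference_id, task_type, service) for each endpoint in the body."""
    payload = json.loads(body)
    return [
        (endpoint["inference_id"], endpoint["task_type"], endpoint["service"])
        for endpoint in payload["endpoints"]
    ]


for inference_id, task_type, service in summarize_endpoints(SAMPLE_BODY):
    print(f"{inference_id}: {task_type} via {service}")
```

The sketch reads only fields marked Required in the attribute list, so it does not fail when optional objects are absent.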