Get an inference endpoint (Added in 8.11.0)

GET /_inference/{inference_id}

Path parameters

  • inference_id string Required

    The unique identifier of the inference endpoint.

Responses

  • 200 application/json
    • endpoints array[object] Required
      • chunking_settings object

        • max_chunk_size number

          Specifies the maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).

        • overlap number

          Specifies the number of overlapping words for chunks. Only for the word chunking strategy. This value cannot be higher than half of max_chunk_size.

        • sentence_overlap number

          Specifies the number of overlapping sentences for chunks. Only for the sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          Specifies the chunking strategy. It can be either sentence or word.

      • service string Required

        The service type

      • service_settings object Required
      • task_settings object
      • inference_id string Required

        The inference Id

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, or completion.
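The chunking constraints described above can be checked with a small helper. This is an illustrative Python sketch, not part of the API; the field names mirror the response attributes, and the function name is our own:

```python
def validate_chunking_settings(settings: dict) -> list:
    """Check a chunking-settings object against the documented constraints.

    Illustrative helper only; returns a list of violation messages (empty if valid).
    """
    errors = []
    strategy = settings.get("strategy")
    max_chunk_size = settings.get("max_chunk_size", 0)

    if strategy == "sentence":
        # max_chunk_size must be between 20 and 300 for the sentence strategy
        if not 20 <= max_chunk_size <= 300:
            errors.append("max_chunk_size must be in [20, 300] for sentence strategy")
        # sentence_overlap can only be 0 or 1
        if settings.get("sentence_overlap", 0) not in (0, 1):
            errors.append("sentence_overlap must be 0 or 1")
    elif strategy == "word":
        # max_chunk_size must be between 10 and 300 for the word strategy
        if not 10 <= max_chunk_size <= 300:
            errors.append("max_chunk_size must be in [10, 300] for word strategy")
        # overlap cannot exceed half of max_chunk_size
        if settings.get("overlap", 0) > max_chunk_size / 2:
            errors.append("overlap cannot exceed half of max_chunk_size")
    else:
        errors.append("strategy must be 'sentence' or 'word'")
    return errors
```

For example, `validate_chunking_settings({"strategy": "word", "max_chunk_size": 100, "overlap": 60})` flags the overlap, since 60 exceeds half of 100.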

GET /_inference/{inference_id}
curl \
 --request GET 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "endpoints": [
    {
      "chunking_settings": {
        "max_chunk_size": 42.0,
        "overlap": 42.0,
        "sentence_overlap": 42.0,
        "strategy": "string"
      },
      "service": "string",
      "service_settings": {},
      "task_settings": {},
      "inference_id": "string",
      "task_type": "sparse_embedding"
    }
  ]
}
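A 200 response shaped like the example above can be consumed programmatically. In this Python sketch the `"chunking_settings"` key name and the sample endpoint values are assumptions for illustration:

```python
import json

# Sample body shaped like the 200 response above; ids and values are placeholders.
body = """
{
  "endpoints": [
    {
      "chunking_settings": {"max_chunk_size": 250, "overlap": 100,
                            "sentence_overlap": 1, "strategy": "sentence"},
      "service": "elser",
      "service_settings": {},
      "task_settings": {},
      "inference_id": "my-elser-model",
      "task_type": "sparse_embedding"
    }
  ]
}
"""

data = json.loads(body)
# Map each inference endpoint id to its task type.
task_types = {ep["inference_id"]: ep["task_type"] for ep in data["endpoints"]}
print(task_types)  # {'my-elser-model': 'sparse_embedding'}
```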