Perform inference on the service using the Unified Schema

Added in 8.18.0

POST /_inference/{task_type}/{inference_id}/_unified

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, or completion.

  • inference_id string Required

    The inference ID.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.
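The path and query parameters above combine into the request URL. A minimal sketch of assembling that URL in Python; the host, helper name, and example values are assumptions for illustration, not part of the reference:

```python
from urllib.parse import urlencode

def unified_inference_url(host, task_type, inference_id, timeout=None):
    """Build the _unified inference endpoint URL from its path and
    optional timeout query parameter (helper name is hypothetical)."""
    url = f"{host}/_inference/{task_type}/{inference_id}/_unified"
    if timeout is not None:
        # timeout is passed as a query-string parameter, e.g. ?timeout=30s
        url += "?" + urlencode({"timeout": timeout})
    return url

# Example with assumed host, task type, and inference ID:
print(unified_inference_url("http://localhost:9200", "completion", "my-model", "30s"))
# → http://localhost:9200/_inference/completion/my-model/_unified?timeout=30s
```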

application/json

Body

  • messages array[object] Required

    A list of objects representing the conversation.

  • model string

    The ID of the model to use.

  • max_completion_tokens number

    The upper bound limit for the number of tokens that can be generated for a completion request.

  • stop array[string]

    A sequence of strings to control when the model should stop generating additional tokens.

  • temperature number

    The sampling temperature to use.

  • tool_choice string | object

    Controls which tool is called by the model. One of: string or object.
  • tools array[object]

    A list of tools that the model can call.

    • type string Required

      The type of tool.

    • function object Required

      Additional properties are allowed.

      • description string

        A description of what the function does. This is used by the model to choose when and how to call the function.

      • name string Required

        The name of the function.

      • parameters object

        The parameters the function accepts. This should be formatted as a JSON object.

        Additional properties are allowed.

      • strict boolean

        Whether to enable schema adherence when generating the function call.

  • top_p number

    Nucleus sampling, an alternative to sampling with temperature.
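Taken together, the body parameters above describe a chat-style completion request. A minimal sketch of assembling and serializing such a body in Python; the model ID, message text, and weather tool are made-up examples, not part of the reference:

```python
import json

# Illustrative request body built from the documented fields; the model ID,
# message content, and tool definition are assumptions for illustration.
body = {
    "model": "gpt-4o",  # assumed model ID
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "max_completion_tokens": 256,
    "temperature": 0.2,
    "tool_choice": "auto",
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                # JSON-Schema-style parameters object, per the reference above
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                "strict": True,
            },
        }
    ],
}

# Serialize to the application/json payload sent in the POST request.
payload = json.dumps(body)
print(json.loads(payload)["model"])
# → gpt-4o
```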

Responses

  • 200 application/json

    Additional properties are allowed.

POST /_inference/{task_type}/{inference_id}/_unified
curl \
 --request POST http://api.example.com/_inference/{task_type}/{inference_id}/_unified \
 --header "Content-Type: application/json" \
 --data '{"messages":[{"content":"string","role":"string","tool_call_id":"string","tool_calls":[{"id":"string","function":{"arguments":"string","name":"string"},"type":"string"}]}],"model":"string","max_completion_tokens":42.0,"stop":["string"],"temperature":42.0,"tool_choice":"string","tools":[{"type":"string","function":{"description":"string","name":"string","parameters":{},"strict":true}}],"top_p":42.0}'
Request examples
{
  "messages": [
    {
      "content": "string",
      "role": "string",
      "tool_call_id": "string",
      "tool_calls": [
        {
          "id": "string",
          "function": {
            "arguments": "string",
            "name": "string"
          },
          "type": "string"
        }
      ]
    }
  ],
  "model": "string",
  "max_completion_tokens": 42.0,
  "stop": [
    "string"
  ],
  "temperature": 42.0,
  "tool_choice": "string",
  "tools": [
    {
      "type": "string",
      "function": {
        "description": "string",
        "name": "string",
        "parameters": {},
        "strict": true
      }
    }
  ],
  "top_p": 42.0
}
Response examples (200)
{}