Create inference API

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Creates a model to perform an inference task.

Request

PUT /_inference/<task_type>/<model_id>

Prerequisites

Description

The create inference API enables you to create and configure an inference model to perform a specific inference task.

Path parameters

<model_id>
(Required, string) The unique identifier of the model.
<task_type>

(Required, string) The type of the inference task that the model will perform. Available task types:

  • sparse_embedding
  • text_embedding

Request body

service

(Required, string) The type of service supported for the specified task type. Available services:

  • elser
service_settings
(Required, object) Settings used to install the inference model. These settings are specific to the service you specified.
task_settings
(Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.
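Taken together, the path parameters and request body above fully determine the HTTP request. A minimal sketch in Python of how a client might assemble it (the helper name and the validation sets are illustrative, not part of the API):

```python
import json

# Task types and services listed in this document.
VALID_TASK_TYPES = {"sparse_embedding", "text_embedding"}
VALID_SERVICES = {"elser"}


def build_inference_request(task_type, model_id, service,
                            service_settings, task_settings=None):
    """Return (method, path, body) for the create inference API.

    Hypothetical helper for illustration only: it assembles the
    HTTP request but does not send it.
    """
    if task_type not in VALID_TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type}")
    if service not in VALID_SERVICES:
        raise ValueError(f"unknown service: {service}")
    # Path parameters: PUT /_inference/<task_type>/<model_id>
    path = f"/_inference/{task_type}/{model_id}"
    # Request body: service (required), service_settings (required),
    # task_settings (optional, defaults to an empty object).
    body = {
        "service": service,
        "service_settings": service_settings,
        "task_settings": task_settings or {},
    }
    return "PUT", path, json.dumps(body)
```

For example, `build_inference_request("sparse_embedding", "my-elser-model", "elser", {"num_allocations": 1, "num_threads": 1})` produces the request shown in the Examples section below.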

Examples

The following example shows how to create an inference model called my-elser-model that performs a sparse_embedding task.

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

Example response:

{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
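The response echoes the service and settings from the request body, plus the model_id and task_type taken from the request path. A small sketch, assuming the response arrives as a JSON string, of how a client might parse and check it with Python's json module:

```python
import json

# The example response from above, as a client would receive it.
response_text = """
{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
"""

resp = json.loads(response_text)

# The identifiers from the request path are echoed back...
assert resp["model_id"] == "my-elser-model"
assert resp["task_type"] == "sparse_embedding"
# ...along with the service configuration from the request body.
assert resp["service"] == "elser"
assert resp["service_settings"]["num_allocations"] == 1
```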