Create inference API
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
Creates a model to perform an inference task.
Request
PUT /_inference/<task_type>/<model_id>
Prerequisites
- Requires the manage cluster privilege.
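If the calling user lacks this privilege, one way to grant it is through a role that carries the manage cluster privilege. The sketch below uses the create roles security API; the role name inference_admin is an illustrative placeholder, not part of this API.

```console
PUT /_security/role/inference_admin
{
  "cluster": [ "manage" ]
}
```

Assign this role to the user (or API key) that will call the create inference API.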
Description
The create inference API enables you to create and configure an inference model to perform a specific inference task.
Path parameters
- <model_id>
  (Required, string) The unique identifier of the model.
- <task_type>
  (Required, string) The type of the inference task that the model will perform. Available task types: sparse_embedding, text_embedding.
Request body
- service
  (Required, string) The type of service supported for the specified task type. Available services: elser.
- service_settings
  (Required, object) Settings used to install the inference model. These settings are specific to the service you specified.
- task_settings
  (Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.
Examples
The following example shows how to create an inference model called my-elser-model to perform a sparse_embedding task type.
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
Example response:
{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
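Once the model exists, it can be invoked through the inference endpoint for the same task type and model ID. A minimal sketch, assuming the my-elser-model configuration created above; the input text is illustrative:

```console
POST _inference/sparse_embedding/my-elser-model
{
  "input": "The quick brown fox jumps over the lazy dog"
}
```

The response contains the sparse embedding produced for the input text.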