Update a trained model deployment

Added in 8.6.0

POST /_ml/trained_models/{model_id}/deployment/_update

Path parameters

  • model_id string Required

    The unique identifier of the trained model. Currently, only PyTorch models are supported.

Query parameters

  • number_of_allocations number

    The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases throughput. If this setting is greater than the number of hardware threads, it is automatically reduced to a value less than the number of hardware threads.
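
As an illustration of how the path and query parameters combine, here is a minimal Python sketch that builds the endpoint URL. The function name `deployment_update_url` and the base URL are hypothetical, and the sketch assumes the query parameter is named `number_of_allocations`, matching the field in the request body.

```python
from urllib.parse import quote, urlencode


def deployment_update_url(base_url, model_id, number_of_allocations=None):
    """Build the _update endpoint URL for a trained model deployment (illustrative helper)."""
    # Percent-encode the model ID so characters such as '/' cannot alter the path.
    path = f"/_ml/trained_models/{quote(model_id, safe='')}/deployment/_update"
    url = base_url.rstrip("/") + path
    # Assumption: the query parameter shares its name with the body field.
    if number_of_allocations is not None:
        url += "?" + urlencode({"number_of_allocations": number_of_allocations})
    return url
```

Calling `deployment_update_url("http://localhost:9200", "my-model", 4)` yields the endpoint path for `my-model` followed by `?number_of_allocations=4`.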

application/json

Body

  • number_of_allocations number

    The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases throughput. If this setting is greater than the number of hardware threads, it is automatically reduced to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it is set automatically.

  • adaptive_allocations object
    • enabled boolean Required

      If true, adaptive_allocations is enabled.

    • min_number_of_allocations number

      Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • max_number_of_allocations number

      Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.
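
The constraints on the body fields can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client: the function `build_update_body` and its parameter names are hypothetical. Rejecting `number_of_allocations` when adaptive allocations are enabled is this sketch's own reading of the "do not set this value" guidance above.

```python
import json


def build_update_body(number_of_allocations=None, adaptive_enabled=None,
                      min_allocations=None, max_allocations=None):
    """Return the request body as a JSON string, enforcing the documented constraints."""
    body = {}
    if adaptive_enabled is None and (min_allocations is not None or max_allocations is not None):
        # 'enabled' is a required attribute of the adaptive_allocations object.
        raise ValueError("adaptive_enabled is required when min or max allocations are given")
    if adaptive_enabled is not None:
        adaptive = {"enabled": adaptive_enabled}
        if min_allocations is not None:
            if min_allocations < 0:
                raise ValueError("min_number_of_allocations must be >= 0")
            adaptive["min_number_of_allocations"] = min_allocations
        if max_allocations is not None:
            if min_allocations is not None and max_allocations < min_allocations:
                raise ValueError("max_number_of_allocations must be >= min_number_of_allocations")
            adaptive["max_number_of_allocations"] = max_allocations
        body["adaptive_allocations"] = adaptive
    if number_of_allocations is not None:
        if adaptive_enabled:
            # With adaptive allocations enabled, the value is set automatically.
            raise ValueError("do not set number_of_allocations when adaptive_allocations is enabled")
        body["number_of_allocations"] = number_of_allocations
    return json.dumps(body)
```

For example, `build_update_body(adaptive_enabled=True, min_allocations=1, max_allocations=4)` produces a body containing only the `adaptive_allocations` object, while `build_update_body(number_of_allocations=2)` produces a body with a fixed allocation count.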

Responses

POST /_ml/trained_models/{model_id}/deployment/_update
curl \
 --request POST 'http://api.example.com/_ml/trained_models/{model_id}/deployment/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"adaptive_allocations":{"enabled":true,"min_number_of_allocations":42.0,"max_number_of_allocations":42.0}}'
Request examples
{
  "adaptive_allocations": {
    "enabled": true,
    "min_number_of_allocations": 42.0,
    "max_number_of_allocations": 42.0
  }
}
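
A request of the same shape can also be prepared from Python using only the standard library. This is a sketch under stated assumptions: the helper `prepare_update_request`, the host `http://localhost:9200`, the model ID `my-model`, and the allocation values are all placeholders, and the `ApiKey` authorization scheme is an assumption about how the cluster authenticates.

```python
import json
import urllib.request


def prepare_update_request(base_url, model_id, body, authorization):
    """Prepare (but do not send) a POST request to the deployment _update endpoint."""
    url = f"{base_url}/_ml/trained_models/{model_id}/deployment/_update"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": authorization, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
request = prepare_update_request(
    "http://localhost:9200",
    "my-model",
    {"adaptive_allocations": {"enabled": True,
                              "min_number_of_allocations": 1,
                              "max_number_of_allocations": 4}},
    "ApiKey <your-encoded-key>",
)
```

Passing `request` to `urllib.request.urlopen` would send it. That call is left out so the sketch has no side effects.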
Response examples (200)
{
  "assignment": {
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 42.0,
      "max_number_of_allocations": 42.0
    },
    "assignment_state": "started",
    "max_assigned_allocations": 42.0,
    "reason": "string",
    "routing_table": {
      "additionalProperty1": {
        "reason": "string",
        "routing_state": "failed",
        "current_allocations": 42.0,
        "target_allocations": 42.0
      },
      "additionalProperty2": {
        "reason": "string",
        "routing_state": "failed",
        "current_allocations": 42.0,
        "target_allocations": 42.0
      }
    },
    "": "string",
    "task_parameters": {
      "": 42.0,
      "model_id": "string",
      "deployment_id": "string",
      "number_of_allocations": 42.0,
      "priority": "normal",
      "queue_capacity": 42.0,
      "threads_per_allocation": 42.0
    }
  }
}
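
To make the response structure concrete, the sketch below condenses an `assignment` object into a short summary. The function `summarize_assignment` is hypothetical, and the sketch assumes that the keys of `routing_table` identify the nodes hosting the deployment.

```python
def summarize_assignment(response):
    """Condense an update response into overall state, allocation totals, and failed nodes."""
    assignment = response["assignment"]
    # Assumption: each routing_table key identifies one node hosting the deployment.
    routing = assignment.get("routing_table", {})
    current = sum(node.get("current_allocations", 0) for node in routing.values())
    target = sum(node.get("target_allocations", 0) for node in routing.values())
    failed = sorted(
        node_id for node_id, node in routing.items()
        if node.get("routing_state") == "failed"
    )
    return {
        "state": assignment.get("assignment_state"),
        "current_allocations": current,
        "target_allocations": target,
        "failed_nodes": failed,
    }
```

A summary whose `current_allocations` is below `target_allocations`, or whose `failed_nodes` list is not empty, suggests the deployment has not yet reached the requested capacity.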