Create a trained model | Elasticsearch API documentation

Create a trained model Generally available; Added in 7.10.0

PUT /_ml/trained_models/{model_id}

Api key auth Basic auth Bearer auth

Enables you to supply a trained model that is not created by data frame analytics.

Required authorization

Cluster privileges: manage_ml

Path parameters

model_id string Required

The unique identifier of the trained model.

Query parameters

defer_definition_decompression boolean Generally available; Added in 8.0.0

If set to true and a compressed_definition is provided, the request defers definition decompression and skips relevant validations.
wait_for_completion boolean Generally available; Added in 8.8.0

Whether to wait for all child operations (e.g. model download) to complete.
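
As a rough illustration of how these two query parameters attach to the request, the sketch below builds the target URL with Python's standard library. The host `http://localhost:9200`, the model ID, and the helper name `build_put_url` are assumptions made for this example; they are not part of the API.

```python
from urllib.parse import quote, urlencode


def build_put_url(base_url, model_id,
                  defer_definition_decompression=None,
                  wait_for_completion=None):
    """Hypothetical helper: build the URL for PUT /_ml/trained_models/{model_id}."""
    params = {}
    if defer_definition_decompression is not None:
        # Booleans travel as lowercase "true"/"false" in the query string.
        params["defer_definition_decompression"] = str(defer_definition_decompression).lower()
    if wait_for_completion is not None:
        params["wait_for_completion"] = str(wait_for_completion).lower()
    url = f"{base_url.rstrip('/')}/_ml/trained_models/{quote(model_id, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


print(build_put_url("http://localhost:9200", "my-model",
                    defer_definition_decompression=True))
```

Parameters left as `None` are omitted, so the server-side defaults apply.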

application/json

Body Required

compressed_definition string

The compressed (GZipped and Base64 encoded) inference definition of the model. If compressed_definition is specified, then definition cannot be specified.
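
A minimal sketch of producing a value for compressed_definition, assuming the definition is serialized as UTF-8 JSON before being gzipped and Base64 encoded (the serialization step is an assumption; the description above only states "GZipped and Base64 encoded"). The helper names are hypothetical.

```python
import base64
import gzip
import json


def compress_definition(definition: dict) -> str:
    """Serialize to JSON, gzip the bytes, then Base64 encode the result."""
    raw = json.dumps(definition).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decompress_definition(compressed: str) -> dict:
    """Inverse of compress_definition, useful for checking a payload locally."""
    return json.loads(gzip.decompress(base64.b64decode(compressed)))


def check_exclusive(body: dict) -> None:
    """Reject bodies that set both definition and compressed_definition."""
    if "definition" in body and "compressed_definition" in body:
        raise ValueError("specify either definition or compressed_definition, not both")


definition = {"trained_model": {"tree": {"feature_names": ["age"], "tree_structure": []}}}
body = {"compressed_definition": compress_definition(definition)}
check_exclusive(body)
```

Round-tripping through `decompress_definition` is a cheap way to confirm that the encoded string still contains the intended definition before sending it.
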
definition object

The inference definition for the model. If definition is specified, then compressed_definition cannot be specified.
- preprocessors array[object]
  
  Collection of preprocessors
  
  frequency_encoding object
  
  field string Required
  
  feature_name string Required
  
  frequency_map object Required
  
  one_hot_encoding object
  
  field string Required
  
  hot_map object Required
  
  target_mean_encoding object
  
  field string Required
  
  feature_name string Required
  
  target_map object Required
  
  default_value number Required
- trained_model object Required
  
  The definition of the trained model.
  
  tree object
  
  The definition for a binary decision tree.
  
  classification_labels array[string]
  
  feature_names array[string] Required
  
  target_type string
  
  tree_structure array[object] Required
  
  tree_node object
  
  The definition of a node in a tree. There are two major types of nodes: leaf nodes and non-leaf nodes.
  
  Leaf nodes only need node_index and leaf_value defined.
  
  All other nodes need split_feature, left_child, right_child, threshold, decision_type, and default_left defined.
  
  decision_type string
  
  default_left boolean
  
  leaf_value number
  
  left_child number
  
  node_index number Required
  
  right_child number
  
  split_feature number
  
  split_gain number
  
  threshold number
  
  ensemble object
  
  The definition for an ensemble model
  
  classification_labels array[string]
  
  feature_names array[string]
  
  target_type string
  
  trained_models array[object] Required
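
The leaf and non-leaf node rules above can be sketched as a small local check. This is an illustration only: it assumes each entry of tree_structure is a tree_node object, the feature name `age` is made up, and the `decision_type` value `lte` is an assumed example value that is not listed on this page.

```python
# Keys that every non-leaf node needs, taken from the tree_node description above.
SPLIT_KEYS = {"split_feature", "left_child", "right_child",
              "threshold", "decision_type", "default_left"}


def classify_node(node: dict) -> str:
    """Return "leaf" or "split"; raise ValueError if required keys are missing."""
    if "node_index" not in node:
        raise ValueError("every node needs node_index")
    if "leaf_value" in node:
        return "leaf"
    missing = SPLIT_KEYS - node.keys()
    if missing:
        raise ValueError(f"node {node['node_index']} is missing {sorted(missing)}")
    return "split"


# Hypothetical one-split regression tree over a single feature named "age".
tree = {
    "feature_names": ["age"],
    "target_type": "regression",
    "tree_structure": [
        {"node_index": 0, "split_feature": 0, "threshold": 30.0,
         "decision_type": "lte", "default_left": True,
         "left_child": 1, "right_child": 2},
        {"node_index": 1, "leaf_value": 10.0},
        {"node_index": 2, "leaf_value": 20.0},
    ],
}
definition = {"trained_model": {"tree": tree}}

print([classify_node(node) for node in tree["tree_structure"]])
```
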
description string

A human-readable description of the inference trained model.
inference_config object

The default configuration for inference. This can be either a regression or classification configuration. It must match the underlying definition.trained_model's target_type. For pre-packaged models such as ELSER the config is not required.
- regression object
  
  Regression configuration for inference.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  num_top_feature_importance_values number
  
  Specifies the maximum number of feature importance values per document.
  
  Default value is 0.
- classification object
  
  Classification configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  num_top_feature_importance_values number
  
  Specifies the maximum number of feature importance values per document.
  
  Default value is 0.
  
  prediction_field_type string
  
  Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  top_classes_results_field string
  
  Specifies the field to which the top classes are written. Defaults to top_classes.
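
To make the documented defaults concrete, the following sketch assembles a classification entry for inference_config. The function name is hypothetical, and the defaults are copied from the attribute descriptions above; prediction_field_type is left out unless given, because this page does not state its default.

```python
ALLOWED_PREDICTION_FIELD_TYPES = ("string", "number", "boolean")


def classification_config(num_top_classes=0,
                          num_top_feature_importance_values=0,
                          results_field="predicted_value",
                          top_classes_results_field="top_classes",
                          prediction_field_type=None):
    """Hypothetical helper: build the classification part of inference_config."""
    config = {
        "num_top_classes": num_top_classes,
        "num_top_feature_importance_values": num_top_feature_importance_values,
        "results_field": results_field,
        "top_classes_results_field": top_classes_results_field,
    }
    if prediction_field_type is not None:
        # Only the three documented values are accepted.
        if prediction_field_type not in ALLOWED_PREDICTION_FIELD_TYPES:
            raise ValueError(f"unsupported prediction_field_type: {prediction_field_type!r}")
        config["prediction_field_type"] = prediction_field_type
    return {"classification": config}


print(classification_config(num_top_classes=2, prediction_field_type="boolean"))
```
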
- text_classification object Generally available; Added in 8.0.0
  
  Text classification configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  classification_labels array[string]
  
  Classification labels to apply other than the stored labels. Must have the same dimensions as the default configured labels.
  
  vocabulary object
- zero_shot_classification object Generally available; Added in 8.0.0
  
  Zero-shot classification configuration for inference.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  hypothesis_template string
  
  Hypothesis template used when tokenizing labels for prediction
  
  Default value is "This example is {}.".
  
  classification_labels array[string] Required
  
  The zero-shot classification labels indicating entailment, neutral, and contradiction. Must contain exactly and only entailment, neutral, and contradiction.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  multi_label boolean
  
  Indicates if more than one true label exists.
  
  Default value is false.
  
  labels array[string]
  
  The labels to predict.
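
A small sketch of how hypothesis_template and labels relate, assuming the label text simply replaces the `{}` placeholder (the exact substitution performed by Elasticsearch is not described on this page). The helper names and example labels are hypothetical, and the order of classification_labels shown here is illustrative only; it has to match the model being configured.

```python
DEFAULT_TEMPLATE = "This example is {}."


def build_hypotheses(labels, template=DEFAULT_TEMPLATE):
    """Fill the {} placeholder of the template with each candidate label."""
    return [template.replace("{}", label) for label in labels]


def zero_shot_config(labels, multi_label=False, template=DEFAULT_TEMPLATE):
    """Hypothetical helper: build the zero_shot_classification part of inference_config."""
    return {
        "zero_shot_classification": {
            # Must contain exactly entailment, neutral, and contradiction.
            "classification_labels": ["entailment", "neutral", "contradiction"],
            "labels": list(labels),
            "multi_label": multi_label,
            "hypothesis_template": template,
        }
    }


print(build_hypotheses(["sports", "politics"]))
```
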
- fill_mask object Generally available; Added in 8.0.0
  
  Fill mask configuration for inference.
  
  mask_token string
  
  The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
- learning_to_rank object
  
  default_params object
  
  * object Additional properties
  
  feature_extractors array[object]
  
  num_top_feature_importance_values number Required
- ner object Generally available; Added in 8.0.0
  
  Named entity recognition configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  classification_labels array[string]
  
  The token classification labels. Must be IOB formatted tags
  
  vocabulary object
- pass_through object Generally available; Added in 8.0.0
  
  Pass through configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
- text_embedding object Generally available; Added in 8.0.0
  
  Text embedding configuration for inference.
  
  embedding_size number
  
  The number of dimensions in the embedding output
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
- text_expansion object Generally available; Added in 8.8.0
  
  Text expansion configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
- question_answering object Generally available; Added in 8.3.0
  
  Question answering configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  max_answer_length number
  
  The maximum answer length to consider
input object

The input field names for the model definition.
- field_names string | array[string] Required
metadata object

An object map that contains metadata about the model.
model_type string
The model type.

Supported values include:
- tree_ensemble: The model definition is an ensemble model of decision trees.
- lang_ident: A special type reserved for language identification models.
- pytorch: The stored definition is a PyTorch (specifically a TorchScript) model. Currently only NLP models are supported.
Values are tree_ensemble, lang_ident, or pytorch.
model_size_bytes number

The estimated memory usage in bytes to keep the trained model in memory. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.
platform_architecture string

The platform architecture (if applicable) of the trained model. If the model works on only one platform, because it is heavily optimized for a particular processor architecture and OS combination, this field specifies which one. The string must match one of the platform identifiers used by Elasticsearch: linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64, or windows-x86_64. For portable models (those that work independently of processor architecture or OS features), leave this field unset.
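
As an illustration, a client could check the value locally before sending the request. The sketch below uses only the identifiers listed above; the function name is hypothetical, and `None` stands for leaving the field unset.

```python
PLATFORM_ARCHITECTURES = (
    "linux-x86_64",
    "linux-aarch64",
    "darwin-x86_64",
    "darwin-aarch64",
    "windows-x86_64",
)


def validate_platform_architecture(value):
    """Return the value if it is a listed identifier; None means a portable model."""
    if value is None:
        return None
    if value not in PLATFORM_ARCHITECTURES:
        raise ValueError(f"unknown platform_architecture: {value!r}")
    return value


print(validate_platform_architecture("linux-aarch64"))
```
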
tags array[string]

An array of tags to organize the model.
prefix_strings object Generally available; Added in 8.12.0

Optional prefix strings applied at inference
- ingest string
  
  String prepended to input at ingest
- search string
  
  String prepended to input at search
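
The effect of prefix_strings can be pictured with a short sketch. It only mirrors the "prepended to input" wording above; the prefixes `passage: ` and `query: ` and the helper name are illustrative assumptions, not values required by the API.

```python
def apply_prefix(text: str, prefix_strings: dict, context: str) -> str:
    """Prepend the prefix configured for the given context ("ingest" or "search")."""
    if context not in ("ingest", "search"):
        raise ValueError("context must be 'ingest' or 'search'")
    # A missing entry means no prefix is applied for that context.
    return prefix_strings.get(context, "") + text


# Hypothetical prefixes for a model that expects different markers per use.
prefix_strings = {"ingest": "passage: ", "search": "query: "}

print(apply_prefix("how do I create a trained model?", prefix_strings, "search"))
```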

Responses

200 application/json
- model_id string Required
  
  Identifier for the trained model.
- model_type string
  
  The model type
  
  Supported values include:
  
  tree_ensemble: The model definition is an ensemble model of decision trees.
  
  lang_ident: A special type reserved for language identification models.
  
  pytorch: The stored definition is a PyTorch (specifically a TorchScript) model. Currently only NLP models are supported.
  
  Values are tree_ensemble, lang_ident, or pytorch.
- tags array[string] Required
  
  An array of tags. A trained model can have many tags, or none.
- version string
  
  The Elasticsearch version number in which the trained model was created.
- compressed_definition string
- created_by string
  
  Information on the creator of the trained model.
- create_time string | number
  
  The time when the trained model was created.
  
  One of: string, or number (EpochTimeUnitMillis; time unit for milliseconds).
- default_field_map object
  
  Any field map described in the inference configuration takes precedence.
  
  * string Additional properties
- description string
  
  The free-text description of the trained model.
- estimated_heap_memory_usage_bytes number
  
  The estimated heap usage in bytes to keep the trained model in memory.
- estimated_operations number
  
  The estimated number of operations to use the trained model.
- fully_defined boolean
  
  True if the full model definition is present.
- inference_config object
  
  The default configuration for inference. This can be either a regression, classification, or one of the many NLP focused configurations. It must match the underlying definition.trained_model's target_type. For pre-packaged models such as ELSER the config is not required.
  
  regression object
  
  Regression configuration for inference.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  num_top_feature_importance_values number
  
  Specifies the maximum number of feature importance values per document.
  
  Default value is 0.
  
  classification object
  
  Classification configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  num_top_feature_importance_values number
  
  Specifies the maximum number of feature importance values per document.
  
  Default value is 0.
  
  prediction_field_type string
  
  Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  top_classes_results_field string
  
  Specifies the field to which the top classes are written. Defaults to top_classes.
  
  text_classification object Generally available; Added in 8.0.0
  
  Text classification configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  classification_labels array[string]
  
  Classification labels to apply other than the stored labels. Must have the same dimensions as the default configured labels.
  
  vocabulary object
  
  zero_shot_classification object Generally available; Added in 8.0.0
  
  Zero-shot classification configuration for inference.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  hypothesis_template string
  
  Hypothesis template used when tokenizing labels for prediction
  
  Default value is "This example is {}.".
  
  classification_labels array[string] Required
  
  The zero-shot classification labels indicating entailment, neutral, and contradiction. Must contain exactly and only entailment, neutral, and contradiction.
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  multi_label boolean
  
  Indicates if more than one true label exists.
  
  Default value is false.
  
  labels array[string]
  
  The labels to predict.
  
  fill_mask object Generally available; Added in 8.0.0
  
  Fill mask configuration for inference.
  
  mask_token string
  
  The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
  
  learning_to_rank object
  
  default_params object
  
  * object Additional properties
  
  feature_extractors array[object]
  
  num_top_feature_importance_values number Required
  
  ner object Generally available; Added in 8.0.0
  
  Named entity recognition configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  classification_labels array[string]
  
  The token classification labels. Must be IOB formatted tags
  
  vocabulary object
  
  pass_through object Generally available; Added in 8.0.0
  
  Pass through configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
  
  text_embedding object Generally available; Added in 8.0.0
  
  Text embedding configuration for inference.
  
  embedding_size number
  
  The number of dimensions in the embedding output
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
  
  text_expansion object Generally available; Added in 8.8.0
  
  Text expansion configuration for inference.
  
  tokenization object
  
  The tokenization options
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  vocabulary object
  
  question_answering object Generally available; Added in 8.3.0
  
  Question answering configuration for inference.
  
  num_top_classes number
  
  Specifies the number of top class predictions to return. Defaults to 0.
  
  tokenization object
  
  The tokenization options to update when inferring
  
  results_field string
  
  The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  
  max_answer_length number
  
  The maximum answer length to consider
- input object Required
  
  The input field names for the model definition.
  
  field_names array[string] Required
  
  An array of input field names for the model.
- license_level string
  
  The license level of the trained model.
- metadata object
  
  An object containing metadata about the trained model. For example, models created by data frame analytics contain analysis_config and input objects.
  
  model_aliases array[string]
  
  feature_importance_baseline object
  
  An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.
  
  * string Additional properties
  
  hyperparameters array[object]
  
  List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.
  
  absolute_importance number
  
  A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.
  
  name string Required
  
  Name of the hyperparameter.
  
  relative_importance number
  
  A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.
  
  supplied boolean Required
  
  Indicates if the hyperparameter is specified by the user (true) or optimized (false).
  
  value number Required
  
  The value of the hyperparameter, either optimized or specified by the user.
  
  total_feature_importance array[object]
  
  An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.
  
  feature_name string Required
  
  The feature for which this importance was calculated.
  
  importance array[object] Required
  
  A collection of feature importance statistics related to the training data set for this particular feature.
  
  classes array[object] Required
  
  If the trained model is a classification model, feature importance statistics are gathered per target class value.
- model_size_bytes number | string
  
  One of: number or string.
- model_package object
  
  create_time number
  
  Time unit for milliseconds
  
  description string
  
  inference_config object
  
  * object Additional properties
  
  metadata object
  
  * object Additional properties
  
  minimum_version string
  
  model_repository string
  
  model_type string
  
  packaged_model_id string Required
  
  platform_architecture string
  
  prefix_strings object
  
  ingest string
  
  String prepended to input at ingest
  
  search string
  
  String prepended to input at search
  
  size number | string
  
  One of: number or string.
  
  sha256 string
  
  tags array[string]
  
  vocabulary_file string
- location object
  
  index object Required
  
  name string Required
- platform_architecture string
- prefix_strings object
  
  ingest string
  
  String prepended to input at ingest
  
  search string
  
  String prepended to input at search

PUT /_ml/trained_models/{model_id}

curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"compressed_definition":"string","definition":{"preprocessors":[{"frequency_encoding":{"field":"string","feature_name":"string","frequency_map":{}},"one_hot_encoding":{"field":"string","hot_map":{}},"target_mean_encoding":{"field":"string","feature_name":"string","target_map":{},"default_value":42.0}}],"trained_model":{"tree":{"classification_labels":["string"],"feature_names":["string"],"target_type":"string","tree_structure":[{}]},"tree_node":{"decision_type":"string","default_left":true,"leaf_value":42.0,"left_child":42.0,"node_index":42.0,"right_child":42.0,"split_feature":42.0,"split_gain":42.0,"threshold":42.0},"ensemble":{"classification_labels":["string"],"feature_names":["string"],"target_type":"string","trained_models":[{}]}}},"description":"string","inference_config":{"regression":{"results_field":"string","num_top_feature_importance_values":0},"classification":{"num_top_classes":42.0,"num_top_feature_importance_values":0,"prediction_field_type":"string","results_field":"string","top_classes_results_field":"string"},"text_classification":{"num_top_classes":42.0,"tokenization":{},"results_field":"string","classification_labels":["string"],"vocabulary":{}},"zero_shot_classification":{"tokenization":{},"hypothesis_template":"\"This example is {}.\"","classification_labels":["string"],"results_field":"string","multi_label":false,"labels":["string"]},"fill_mask":{"mask_token":"string","num_top_classes":42.0,"tokenization":{},"results_field":"string","vocabulary":{}},"learning_to_rank":{"default_params":{"additionalProperty1":{},"additionalProperty2":{}},"feature_extractors":[{}],"num_top_feature_importance_values":42.0},"ner":{"tokenization":{},"results_field":"string","classification_labels":["string"],"vocabulary":{}},"pass_through":{"tokenization":{},"results_field":"string","vocabulary":{}},"text_embedding":{"embedding_size":42.0,"tokenization":{},"results_field":"string","vocabulary":{}},"text_expansion":{"tokenization":{},"results_field":"string","vocabulary":{}},"question_answering":{"num_top_classes":42.0,"tokenization":{},"results_field":"string","max_answer_length":42.0}},"input":{"field_names":"string"},"metadata":{},"model_type":"tree_ensemble","model_size_bytes":42.0,"platform_architecture":"string","tags":["string"],"prefix_strings":{"ingest":"string","search":"string"}}'
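
The generated request above fills every schema field with a placeholder, including both compressed_definition and definition, which cannot be combined. As a more focused sketch, the Python below prepares (without sending) a request for a small regression tree. The host, model ID, API key placeholder, feature name, `decision_type` value, and helper name are all assumptions for illustration, and the `ApiKey` authorization scheme is assumed for API key authentication. Whether a given cluster accepts this exact body is not guaranteed by this page.

```python
import json
import urllib.request


def build_create_request(base_url, model_id, api_key, body):
    """Hypothetical helper: prepare PUT /_ml/trained_models/{model_id} without sending it."""
    url = f"{base_url.rstrip('/')}/_ml/trained_models/{model_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",  # assumed API key scheme
            "Content-Type": "application/json",
        },
        method="PUT",
    )


body = {
    "description": "Example regression tree",
    "model_type": "tree_ensemble",
    # The inference_config type must match the definition's target_type.
    "inference_config": {"regression": {"results_field": "predicted_value"}},
    "input": {"field_names": ["age"]},
    "tags": ["example"],
    "definition": {
        "trained_model": {
            "tree": {
                "feature_names": ["age"],
                "target_type": "regression",
                "tree_structure": [
                    {"node_index": 0, "split_feature": 0, "threshold": 30.0,
                     "decision_type": "lte", "default_left": True,
                     "left_child": 1, "right_child": 2},
                    {"node_index": 1, "leaf_value": 10.0},
                    {"node_index": 2, "leaf_value": 20.0},
                ],
            }
        }
    },
}

request = build_create_request("http://localhost:9200", "my-regression-tree",
                               "<api-key>", body)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
print(request.get_method(), request.full_url)
```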
