Create trained model vocabulary API

Creates a trained model vocabulary. This is supported only for natural language processing (NLP) models.

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Request

PUT _ml/trained_models/<model_id>/vocabulary/

Prerequisites

Requires the manage_ml cluster privilege. This privilege is included in the machine_learning_admin built-in role.

Description

The vocabulary is stored in the index as described in inference_config.*.vocabulary of the trained model definition.

Path parameters

<model_id>
(Required, string) The unique identifier of the trained model.

Request body

vocabulary
(Required, array) The model vocabulary. Must not be empty.
merges
(Optional, array) The model merges used in byte-pair encoding. The merges must be sub-token pairs, space delimited, and in order of preference. Example: ["f o", "fo o"]. Must be provided for RoBERTa and BART style models.
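The constraints on the request body can be sketched as a small client-side check. This is only an illustration: `build_vocabulary_body` is a hypothetical helper, not part of any Elasticsearch client, and the checks mirror the descriptions above rather than the server's actual validation.

```python
import json
from typing import Optional


def build_vocabulary_body(vocabulary: list, merges: Optional[list] = None) -> dict:
    """Assemble a request body for the vocabulary API (hypothetical helper)."""
    # The vocabulary must not be empty.
    if not vocabulary:
        raise ValueError("vocabulary must not be empty")
    body = {"vocabulary": list(vocabulary)}

    if merges is not None:
        for merge in merges:
            parts = merge.split(" ")
            # Each merge is a pair of sub-tokens delimited by a single space.
            if len(parts) != 2 or "" in parts:
                raise ValueError(f"merge is not a space-delimited pair: {merge!r}")
        # List order is kept as given, since merges are in order of preference.
        body["merges"] = list(merges)
    return body


# Toy tokens chosen to match the merges example above; not a real vocabulary.
body = build_vocabulary_body(["f", "o", "fo", "foo"], merges=["f o", "fo o"])
print(json.dumps(body, indent=2))
```

Omitting `merges` leaves the key out of the body entirely, which matches its optional status for models that do not use byte-pair encoding.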

Examples

The following example shows how to create a model vocabulary for a previously stored trained model configuration.

PUT _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/vocabulary
{
  "vocabulary": [
    "[PAD]",
    "[unused0]",
    ...
  ]
}

The API returns the following results:

{
    "acknowledged": true
}
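The same request could be issued from a script. The following is a minimal sketch using only the Python standard library, assuming an unsecured cluster at `http://localhost:9200`; the base URL and the helper names `make_vocabulary_request` and `put_vocabulary` are assumptions for illustration. The two-token list stands in for the truncated vocabulary shown above, and a real call would pass the model's complete vocabulary.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local cluster without TLS or authentication.
ES_URL = "http://localhost:9200"


def make_vocabulary_request(model_id, vocabulary, base_url=ES_URL):
    """Build (but do not send) the PUT request for a model vocabulary."""
    path = f"/_ml/trained_models/{urllib.parse.quote(model_id, safe='')}/vocabulary"
    data = json.dumps({"vocabulary": vocabulary}).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def put_vocabulary(request):
    """Send the request and report whether the cluster acknowledged it."""
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("acknowledged") is True


request = make_vocabulary_request(
    "elastic__distilbert-base-uncased-finetuned-conll03-english",
    ["[PAD]", "[unused0]"],  # placeholder; the full token list is elided
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it possible to inspect the URL and body before calling `put_vocabulary(request)` against a live cluster. A secured cluster would additionally need credentials, for example an `Authorization` header.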