Import the trained model and vocabulary
If you want to install a trained model in a restricted or closed network, refer to these instructions.
After you choose a model, you must import it and its tokenizer vocabulary to your cluster. Because of its size, the model must be split into chunks and imported one chunk at a time so that it can be stored in parts.
Trained models must be in a TorchScript representation for use with Elastic Stack machine learning features.
Eland is an Elasticsearch Python client that provides a simple script to convert Hugging Face transformer models to their TorchScript representations, split them into chunks, and upload them to Elasticsearch; it is therefore the recommended import method. You can either install the Python Eland client on your machine or use a Docker image to build Eland and run the model import script.
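The chunking described above can be pictured with a minimal Python sketch that reads a serialized model in fixed-size parts. This is an illustration only and not Eland's actual implementation: the function name `iter_model_chunks` and the 1 MiB default chunk size are assumptions made for this example.

```python
import io
from typing import BinaryIO, Iterator, Tuple

# Hypothetical chunk size chosen for illustration; not a value taken from Eland.
CHUNK_SIZE = 1024 * 1024  # 1 MiB


def iter_model_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes]]:
    """Yield (part_number, data) pairs, reading at most chunk_size bytes at a time."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    part = 0
    while True:
        data = stream.read(chunk_size)
        if not data:  # an empty read means the stream is exhausted
            break
        yield part, data
        part += 1


# Usage: a 10-byte stand-in for a serialized model, split into 4-byte parts.
fake_model = io.BytesIO(b"0123456789")
parts = list(iter_model_chunks(fake_model, chunk_size=4))
print([(number, len(data)) for number, data in parts])  # [(0, 4), (1, 4), (2, 2)]
```

Reading the stream lazily keeps only one chunk in memory at a time, which is the point of importing a large model in parts.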
Import with the Eland client installed
-
Install the Eland Python client with PyTorch extra dependencies.
python -m pip install 'eland[pytorch]'
-
Run the eland_import_hub_model script to download the model from Hugging Face, convert it to TorchScript format, and upload it to the Elasticsearch cluster. For example:
eland_import_hub_model --cloud-id <cloud-id> \
  -u <username> -p <password> \
  --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \
  --task-type ner
Specify the Elastic Cloud identifier with --cloud-id. Alternatively, use --url.
Provide authentication details (-u and -p) to access your cluster. Refer to Authentication methods to learn more.
Specify the identifier for the model in the Hugging Face model hub with --hub-model-id.
Specify the type of NLP task with --task-type. Supported values are fill_mask, ner, text_classification, text_embedding, and zero_shot_classification.
For more details, refer to https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch.
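When the import is driven from a Python script rather than typed by hand, it can help to assemble the command as an argument list and check the task type before anything is downloaded. The following is a minimal sketch under that assumption: `build_import_command` is a hypothetical helper, not part of Eland, and it uses only the flags and task types mentioned in this section. The cloud identifier and credentials in the usage lines are placeholders.

```python
import shlex
from typing import List, Optional

# Task types listed in this section.
SUPPORTED_TASK_TYPES = {
    "fill_mask",
    "ner",
    "text_classification",
    "text_embedding",
    "zero_shot_classification",
}


def build_import_command(
    hub_model_id: str,
    task_type: str,
    cloud_id: Optional[str] = None,
    url: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build the eland_import_hub_model argument list."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    # The cluster is addressed with either --cloud-id or --url, never both.
    if (cloud_id is None) == (url is None):
        raise ValueError("provide exactly one of cloud_id or url")

    cmd = ["eland_import_hub_model"]
    if cloud_id is not None:
        cmd += ["--cloud-id", cloud_id]
    else:
        cmd += ["--url", url]
    if username is not None and password is not None:
        cmd += ["-u", username, "-p", password]
    cmd += ["--hub-model-id", hub_model_id, "--task-type", task_type]
    return cmd


# Usage with placeholder values; shlex.join renders a copy-pasteable command line.
command = build_import_command(
    "elastic/distilbert-base-cased-finetuned-conll03-english",
    "ner",
    cloud_id="my-cloud-id",
    username="elastic",
    password="changeme",
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where Eland is installed; keeping the arguments as a list avoids shell quoting problems with passwords.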
Import with Docker
If you want to use Eland without installing it, clone the Eland repository (https://github.com/elastic/eland) and, from its root directory, run the following to build the Docker image:
$ docker build -t elastic/eland .
You can now use the container interactively:
$ docker run -it --rm --network host elastic/eland
The eland_import_hub_model script can be run directly in the docker command:
docker run -it --rm docker.elastic.co/eland/eland \
  eland_import_hub_model \
  --url $ELASTICSEARCH_URL \
  --hub-model-id elastic/distilbert-base-uncased-finetuned-conll03-english \
  --start
Replace $ELASTICSEARCH_URL with the URL for your Elasticsearch cluster. Refer to Authentication methods to learn more.
Authentication methods
The following authentication options are available when using the import script:
-
username/password authentication (specified with the -u and -p options):
eland_import_hub_model --url https://<hostname>:<port> -u <username> -p <password> ...
-
username/password authentication (embedded in the URL):
eland_import_hub_model --url https://<user>:<password>@<hostname>:<port> ...
-
API key authentication:
eland_import_hub_model --url https://<hostname>:<port> --es-api-key <api-key> ...
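The three options above can be compared side by side with a small Python sketch that produces the corresponding arguments. `auth_arguments` is a hypothetical helper written for this example, not part of Eland. When credentials are embedded in the URL, the sketch percent-encodes them, which is the standard URL convention for reserved characters such as `@` and `:`; it assumes that the client decodes them again when it parses the URL.

```python
from typing import List, Optional
from urllib.parse import quote


def auth_arguments(
    host_port: str,
    *,
    username: Optional[str] = None,
    password: Optional[str] = None,
    api_key: Optional[str] = None,
    embed_in_url: bool = False,
) -> List[str]:
    """Hypothetical helper: return the --url and credential arguments."""
    if api_key is not None:
        if username is not None or password is not None:
            raise ValueError("use either an API key or a username/password, not both")
        return ["--url", f"https://{host_port}", "--es-api-key", api_key]

    if username is None or password is None:
        raise ValueError("username and password are both required")

    if embed_in_url:
        # Percent-encode so that '@' or ':' inside the credentials
        # cannot be mistaken for URL delimiters.
        user = quote(username, safe="")
        secret = quote(password, safe="")
        return ["--url", f"https://{user}:{secret}@{host_port}"]

    return ["--url", f"https://{host_port}", "-u", username, "-p", password]


# Usage with placeholder values.
print(auth_arguments("localhost:9200", username="elastic", password="p@ss:word", embed_in_url=True))
# ['--url', 'https://elastic:p%40ss%3Aword@localhost:9200']
```

Passing the credentials with -u and -p, or using an API key, avoids the encoding question entirely and keeps the password out of the URL.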