Inference processor
Uses a pre-trained data frame analytics model or a model deployed for natural language processing tasks to infer against the data that is being ingested in the pipeline.
Table 27. Inference Options
Name | Required | Default | Description |
---|---|---|---|
model_id | yes | - | (String) The ID or alias for the trained model, or the ID of the deployment. |
input_output | no | - | (List) Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options. |
target_field | no | ml.inference.<processor_tag> | (String) Field added to incoming documents to contain results objects. |
field_map | no | If defined the model’s default field map | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration. |
inference_config | no | The default settings defined in the model | (Object) Contains the inference type and its options. |
ignore_missing | no | false | (Boolean) If true and any of the input fields defined in input_output are missing, those missing fields are quietly ignored; otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields. |
description | no | - | Description of the processor. Useful for describing the purpose of the processor or its configuration. |
if | no | - | Conditionally execute the processor. See Conditionally run a processor. |
ignore_failure | no | false | Ignore failures for the processor. See Handling pipeline failures. |
on_failure | no | - | Handle failures for the processor. See Handling pipeline failures. |
tag | no | - | Identifier for the processor. Useful for debugging and metrics. |
- You cannot use the input_output field with the target_field and field_map fields. For NLP models, use the input_output option. For data frame analytics models, use the target_field and field_map options.
- Each inference input field must be a single string, not an array of strings.
- The input_field is processed as is and ignores any index mapping's analyzers at the time of the inference run.
Configuring input and output fields
Select the content field for inference and write the result to content_embedding.
If the specified output_field already exists in the ingest document, it won’t be overwritten. The inference results will be appended to the existing fields within output_field, which could lead to duplicate fields and potential errors. To avoid this, use a unique output_field name that does not clash with any existing fields.
{
  "inference": {
    "model_id": "model_deployment_for_inference",
    "input_output": [
      {
        "input_field": "content",
        "output_field": "content_embedding"
      }
    ]
  }
}
Configuring multiple inputs
The content and title fields will be read from the incoming document and sent to the model for the inference. The inference output is written to content_embedding and title_embedding respectively.
{
  "inference": {
    "model_id": "model_deployment_for_inference",
    "input_output": [
      {
        "input_field": "content",
        "output_field": "content_embedding"
      },
      {
        "input_field": "title",
        "output_field": "title_embedding"
      }
    ]
  }
}
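As a minimal sketch of how such a definition could be assembled programmatically, the following Python helper builds the input_output list from a mapping of input fields to output fields. It rejects output names that repeat or that collide with one of the input fields. The helper name build_inference_processor is hypothetical and is not part of any Elasticsearch client; the checks only cover the names passed in, not fields that already exist in the ingest document.

```python
from typing import Any, Dict, Mapping


def build_inference_processor(model_id: str, field_pairs: Mapping[str, str]) -> Dict[str, Any]:
    """Build an inference processor with one input_output entry per field pair.

    field_pairs maps each input field name to the output field that receives
    the inference result. This is an illustrative helper, not a client API.
    """
    outputs = list(field_pairs.values())
    if len(set(outputs)) != len(outputs):
        raise ValueError("each output_field must be unique")
    clashes = sorted(set(outputs) & set(field_pairs))
    if clashes:
        raise ValueError(f"output_field clashes with an input field: {clashes}")
    return {
        "inference": {
            "model_id": model_id,
            "input_output": [
                {"input_field": source, "output_field": target}
                for source, target in field_pairs.items()
            ],
        }
    }


processor = build_inference_processor(
    "model_deployment_for_inference",
    {"content": "content_embedding", "title": "title_embedding"},
)
```

The resulting dictionary has the same shape as the JSON definition above and can be serialized with json.dumps.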
Selecting the input fields with input_output is incompatible with the target_field and field_map options.
Data frame analytics models must use the target_field to specify the root location results are written to and optionally a field_map to map field names in the input document to the model input fields.
{
  "inference": {
    "model_id": "model_deployment_for_inference",
    "target_field": "FlightDelayMin_prediction_infer",
    "field_map": {
      "your_field": "my_field"
    },
    "inference_config": {
      "regression": {}
    }
  }
}
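The incompatibility rule can be checked before a pipeline is submitted. The sketch below is a hypothetical pre-flight check (the function name check_inference_options is an assumption, not an Elasticsearch API) that raises an error when input_output appears together with target_field or field_map.

```python
from typing import Any, Mapping


def check_inference_options(options: Mapping[str, Any]) -> None:
    """Raise ValueError when input_output is mixed with target_field or field_map."""
    if "input_output" not in options:
        return
    conflicts = [name for name in ("target_field", "field_map") if name in options]
    if conflicts:
        raise ValueError("input_output cannot be combined with " + " or ".join(conflicts))


# A data frame analytics style configuration passes the check.
check_inference_options({
    "model_id": "model_deployment_for_inference",
    "target_field": "FlightDelayMin_prediction_infer",
    "field_map": {"your_field": "my_field"},
    "inference_config": {"regression": {}},
})
```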
Classification configuration options
Classification configuration for inference.
num_top_classes
    (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
num_top_feature_importance_values
    (Optional, integer) Specifies the maximum number of feature importance values per document. Defaults to 0, which means no feature importance calculation occurs.
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
top_classes_results_field
    (Optional, string) Specifies the field to which the top classes are written. Defaults to top_classes.
prediction_field_type
    (Optional, string) Specifies the type of the predicted field to write. Valid values are: string, number, boolean. When boolean is provided, 1.0 is transformed to true and 0.0 to false.
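To make two of these options concrete, here is a small illustrative sketch in Python. It is not how Elasticsearch implements them: top_classes and as_boolean are hypothetical helpers, the class names and probabilities are made up, and rejecting negative counts or values other than 1.0 and 0.0 is an assumption of the sketch.

```python
from typing import Dict, List, Tuple


def top_classes(probabilities: Dict[str, float], num_top_classes: int = 0) -> List[Tuple[str, float]]:
    """Return the num_top_classes most probable classes, highest first."""
    if num_top_classes < 0:
        # Negative values are not covered by this sketch.
        raise ValueError("num_top_classes must not be negative")
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:num_top_classes]


def as_boolean(prediction: float) -> bool:
    """Map 1.0 to True and 0.0 to False, as described for the boolean type."""
    if prediction == 1.0:
        return True
    if prediction == 0.0:
        return False
    # Assumption of this sketch: any other value is rejected.
    raise ValueError(f"cannot convert {prediction!r} to a boolean")


scores = {"cat": 0.7, "dog": 0.2, "bird": 0.1}
two_best = top_classes(scores, 2)   # [("cat", 0.7), ("dog", 0.2)]
none_requested = top_classes(scores)  # [] because the default is 0
```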
Fill mask configuration options
num_top_classes
    (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
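The three truncate values can be illustrated with a short Python sketch that operates on plain token lists. This is a simplified model under stated assumptions, not the tokenizer's actual implementation: the function name truncate_tokens is hypothetical, special tokens that a real tokenizer adds are ignored, and the sketch raises an error if the selected sequence is too short to absorb the overflow.

```python
from typing import List, Optional, Tuple


def truncate_tokens(
    first: List[str],
    second: Optional[List[str]],
    max_sequence_length: int,
    truncate: str = "first",
) -> Tuple[List[str], List[str]]:
    """Shorten one sequence so both fit into max_sequence_length tokens."""
    second = second or []
    overflow = len(first) + len(second) - max_sequence_length
    if overflow <= 0:
        return first, second
    if truncate == "none":
        raise ValueError("input exceeds max_sequence_length and truncate is none")
    if truncate not in ("first", "second"):
        raise ValueError(f"unknown truncate value: {truncate}")
    # With a single sequence, "second" falls back to truncating that sequence.
    cut_second = truncate == "second" and bool(second)
    sequence = second if cut_second else first
    keep = len(sequence) - overflow
    if keep < 0:
        raise ValueError("the selected sequence is too short to absorb the overflow")
    if cut_second:
        return first, second[:keep]
    return first[:keep], second
```

For example, truncate_tokens(list("abcdef"), list("xy"), 5) keeps the first three tokens of the first sequence and leaves the second sequence untouched.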
NER configuration options
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
Regression configuration options
Regression configuration for inference.
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
num_top_feature_importance_values
    (Optional, integer) Specifies the maximum number of feature importance values per document. By default, it is zero and no feature importance calculation occurs.
Text classification configuration options
classification_labels
    (Optional, string) An array of classification labels.
num_top_classes
    (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        span
            (Optional, integer) When truncate is none, you can partition longer text sequences for inference. The value indicates how many tokens overlap between each subsequence. The default value is -1, indicating no windowing or spanning occurs. When your typical input is just slightly larger than max_sequence_length, it may be best to simply truncate; there will be very little information in the second subsequence.
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        span
            (Optional, integer) When truncate is none, you can partition longer text sequences for inference. The value indicates how many tokens overlap between each subsequence. The default value is -1, indicating no windowing or spanning occurs. When your typical input is just slightly larger than max_sequence_length, it may be best to simply truncate; there will be very little information in the second subsequence.
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
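The span option describes overlapping windows over a long token sequence. The Python sketch below shows one plausible way to produce such windows; it is an illustration under assumptions, not the actual windowing code. The function name split_into_windows is hypothetical, special tokens are ignored, and the sketch assumes each window starts max_sequence_length minus span tokens after the previous one.

```python
from typing import List


def split_into_windows(tokens: List[str], max_sequence_length: int, span: int = -1) -> List[List[str]]:
    """Partition tokens into windows that overlap by span tokens."""
    if len(tokens) <= max_sequence_length:
        return [tokens]
    if span < 0:
        # No windowing: with truncate set to none the request would fail.
        raise ValueError("input exceeds max_sequence_length and span is disabled")
    if span >= max_sequence_length:
        raise ValueError("span must be smaller than max_sequence_length")
    step = max_sequence_length - span
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_sequence_length])
        if start + max_sequence_length >= len(tokens):
            break
        start += step
    return windows


tokens = [f"t{i}" for i in range(10)]
windows = split_into_windows(tokens, max_sequence_length=4, span=1)
# Input only slightly longer than the limit: the second window adds one new token.
short_tail = split_into_windows(tokens[:5], max_sequence_length=4, span=1)
```

The short_tail case shows why truncating can be the better choice when inputs barely exceed max_sequence_length: the extra window carries almost no new information.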
Text embedding configuration options
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
Text expansion configuration options
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        span
            (Optional, integer) When truncate is none, you can partition longer text sequences for inference. The value indicates how many tokens overlap between each subsequence. The default value is -1, indicating no windowing or spanning occurs. When your typical input is just slightly larger than max_sequence_length, it may be best to simply truncate; there will be very little information in the second subsequence.
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        span
            (Optional, integer) When truncate is none, you can partition longer text sequences for inference. The value indicates how many tokens overlap between each subsequence. The default value is -1, indicating no windowing or spanning occurs. When your typical input is just slightly larger than max_sequence_length, it may be best to simply truncate; there will be very little information in the second subsequence.
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
Zero shot classification configuration options
labels
    (Optional, array) The labels to classify. Can be set at creation for default labels, and then updated during inference.
multi_label
    (Optional, boolean) Indicates if more than one true label is possible given the input. This is useful when labeling text that could pertain to more than one of the input labels. Defaults to false.
results_field
    (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to the results_field value of the data frame analytics job that was used to train the model, which defaults to <dependent_variable>_prediction.
tokenization
    (Optional, object) Indicates the tokenization to perform and the desired settings. The default tokenization configuration is bert. Valid tokenization values are:
    - bert: Use for BERT-style models.
    - mpnet: Use for MPNet-style models.
    - roberta: Use for RoBERTa-style and BART-style models.
    - xlm_roberta: [preview] Use for XLMRoBERTa-style models.
    - bert_ja: [preview] Use for BERT-style models trained for the Japanese language.
    [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
    Properties of tokenization
    bert
        (Optional, object) BERT-style tokenization is to be performed with the enclosed settings.
        Properties of bert
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    roberta
        (Optional, object) RoBERTa-style tokenization is to be performed with the enclosed settings.
        Properties of roberta
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
    mpnet
        (Optional, object) MPNet-style tokenization is to be performed with the enclosed settings.
        Properties of mpnet
        truncate
            (Optional, string) Indicates how tokens are truncated when they exceed max_sequence_length. The default value is first.
            - none: No truncation occurs; the inference request receives an error.
            - first: Only the first sequence is truncated.
            - second: Only the second sequence is truncated. If there is just one sequence, that sequence is truncated.
            For zero_shot_classification, the hypothesis sequence is always the second sequence. Therefore, do not use second in this case.
Inference processor examples
{
  "inference": {
    "model_id": "my_model_id",
    "field_map": {
      "original_fieldname": "expected_fieldname"
    },
    "inference_config": {
      "regression": {
        "results_field": "my_regression"
      }
    }
  }
}
This configuration specifies a regression inference and the results are written to the my_regression field contained in the target_field results object. The field_map configuration maps the field original_fieldname from the source document to the field expected by the model.
{
  "inference": {
    "model_id": "my_model_id",
    "inference_config": {
      "classification": {
        "num_top_classes": 2,
        "results_field": "prediction",
        "top_classes_results_field": "probabilities"
      }
    }
  }
}
This configuration specifies a classification inference. The number of categories for which the predicted probabilities are reported is 2 (num_top_classes). The result is written to the prediction field and the top classes to the probabilities field. Both fields are contained in the target_field results object.
For an example that uses natural language processing trained models, refer to Add NLP inference to ingest pipelines.
Feature importance object mapping
To get the full benefit of aggregating and searching for feature importance, update the index mapping of the feature importance result field as shown below:
"ml.inference.feature_importance": {
  "type": "nested",
  "dynamic": true,
  "properties": {
    "feature_name": {
      "type": "keyword"
    },
    "importance": {
      "type": "double"
    }
  }
}
The mapping field name for feature importance (in the example above, it is ml.inference.feature_importance) is compounded as follows:
<ml.inference.target_field>.<inference.tag>.feature_importance
- <ml.inference.target_field>: defaults to ml.inference.
- <inference.tag>: if it is not provided in the processor definition, then it is not part of the field path.
For example, if you provide a tag foo in the definition as you can see below:
{ "tag": "foo", ... }
Then, the feature importance value is written to the ml.inference.foo.feature_importance field.
You can also specify the target field as follows:
{ "tag": "foo", "target_field": "my_field" }
In this case, feature importance is exposed in the my_field.foo.feature_importance field.
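The path composition rule above can be written as a small Python function. This is only a sketch for working out which field name to map; the function name feature_importance_field is hypothetical.

```python
from typing import Optional


def feature_importance_field(target_field: str = "ml.inference", tag: Optional[str] = None) -> str:
    """Compose <target_field>.<tag>.feature_importance, omitting the tag if unset."""
    parts = [target_field]
    if tag:
        parts.append(tag)
    parts.append("feature_importance")
    return ".".join(parts)


default_path = feature_importance_field()
tagged_path = feature_importance_field(tag="foo")
custom_path = feature_importance_field(target_field="my_field", tag="foo")
```

The three calls reproduce the three cases described in this section: no tag, a tag of foo, and a tag of foo combined with a target field of my_field.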