Evaluate data frame analytics API
Evaluates the data frame analytics for an annotated index.
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
Request
POST _ml/data_frame/_evaluate
Prerequisites
- You must have the monitor_ml privilege to use this API. For more information, see Security privileges and Built-in roles.
Description
The API packages together commonly used evaluation metrics for various types of machine learning features. It is designed for use on indexes created by data frame analytics. Evaluation requires both a ground truth field and an analytics result field to be present in the index.
Request body
index
    (Required, object) Defines the index in which the evaluation will be performed.
query
    (Optional, object) A query clause that retrieves a subset of data from the source index. See Query DSL.
evaluation
    (Required, object) Defines the type of evaluation you want to perform. See Data frame analytics evaluation resources.
    Available evaluation types:
    - binary_soft_classification
    - regression
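The three request body properties can be assembled programmatically. The following is a minimal sketch, not part of any official client: the helper name build_evaluate_body and its parameters are hypothetical, and the set of accepted evaluation types is taken only from the list above.

```python
import json

# Evaluation types listed in the request body section of this page.
EVALUATION_TYPES = {"binary_soft_classification", "regression"}


def build_evaluate_body(index, evaluation_type, evaluation_config, query=None):
    """Assemble a request body for POST _ml/data_frame/_evaluate.

    Hypothetical helper for illustration; it only checks the evaluation
    type against the types documented on this page.
    """
    if evaluation_type not in EVALUATION_TYPES:
        raise ValueError(f"unsupported evaluation type: {evaluation_type!r}")
    body = {
        "index": index,
        "evaluation": {evaluation_type: evaluation_config},
    }
    # "query" is optional, so it is only included when provided.
    if query is not None:
        body["query"] = query
    return body


body = build_evaluate_body(
    "my_analytics_dest_index",
    "binary_soft_classification",
    {
        "actual_field": "is_outlier",
        "predicted_probability_field": "ml.outlier_score",
    },
)
print(json.dumps(body, indent=2))
```

Leaving out the query argument omits the query property entirely, which matches its optional status in the request body.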
Examples
Binary soft classification
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "binary_soft_classification": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
The API returns the following results:
{
  "binary_soft_classification": {
    "auc_roc": {
      "score": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": { "tp": 5, "fp": 9, "tn": 204, "fn": 5 },
      "0.5": { "tp": 1, "fp": 5, "tn": 208, "fn": 9 },
      "0.75": { "tp": 0, "fp": 4, "tn": 209, "fn": 10 }
    },
    "precision": {
      "0.25": 0.35714285714285715,
      "0.5": 0.16666666666666666,
      "0.75": 0
    },
    "recall": {
      "0.25": 0.5,
      "0.5": 0.1,
      "0.75": 0
    }
  }
}
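The precision and recall values in this response are consistent with the confusion matrix at each threshold. The sketch below recomputes them from the tp, fp, and fn counts using the standard definitions (precision = tp / (tp + fp), recall = tp / (tp + fn)). Returning 0.0 when a denominator is zero is an assumption of this sketch, not documented API behavior.

```python
# Confusion matrix from the example response, keyed by threshold.
confusion_matrix = {
    "0.25": {"tp": 5, "fp": 9, "tn": 204, "fn": 5},
    "0.5": {"tp": 1, "fp": 5, "tn": 208, "fn": 9},
    "0.75": {"tp": 0, "fp": 4, "tn": 209, "fn": 10},
}


def precision(cm):
    """Share of predicted positives that are true positives."""
    predicted_positive = cm["tp"] + cm["fp"]
    # Assumption: report 0.0 when nothing was predicted positive.
    return cm["tp"] / predicted_positive if predicted_positive else 0.0


def recall(cm):
    """Share of actual positives that were predicted positive."""
    actual_positive = cm["tp"] + cm["fn"]
    # Assumption: report 0.0 when there are no actual positives.
    return cm["tp"] / actual_positive if actual_positive else 0.0


precisions = {t: precision(cm) for t, cm in confusion_matrix.items()}
recalls = {t: recall(cm) for t, cm in confusion_matrix.items()}

for threshold in confusion_matrix:
    print(threshold, precisions[threshold], recalls[threshold])
```

At the 0.25 threshold, for example, precision is 5 / (5 + 9) and recall is 5 / (5 + 5), which agree with the precision and recall entries in the response.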
Regression
POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions",
  "query": {
    "bool": {
      "filter": [
        { "term": { "ml.is_training": false } }
      ]
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price",
      "predicted_field": "ml.price_prediction",
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
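- index: The output destination index from a data frame analytics regression analysis.
- query: In this example, a test/train split was defined for the regression analysis. The query limits the evaluation to the test split (documents where ml.is_training is false).
- actual_field: The ground truth value for the actual house price. This is required in order to evaluate results.
- predicted_field: The predicted value for house price calculated by the regression analysis.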
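To make the two requested metrics concrete, the sketch below computes mean squared error and R squared from their textbook definitions. This page does not state how Elasticsearch computes them internally, so treat this as an illustration of the quantities rather than the API's implementation. The price values are invented for the example.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values standing in for the "price" (ground truth) and
# "ml.price_prediction" (predicted) fields of three documents.
actual_prices = [200.0, 250.0, 300.0]
predicted_prices = [210.0, 240.0, 310.0]

mse = mean_squared_error(actual_prices, predicted_prices)
r2 = r_squared(actual_prices, predicted_prices)
print(mse, r2)
```

A lower mean squared error and an R squared closer to 1 both indicate predictions that track the ground truth more closely.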
The following example calculates the training error:
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": { "value": true }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3",
      "predicted_field": "ml.G3_prediction",
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
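- query: In this example, a test/train split was defined for the regression analysis. The query limits the evaluation to the training split (documents where ml.is_training is true).
- actual_field: The field that contains the ground truth value for the actual student performance. This is required in order to evaluate results.
- predicted_field: The field that contains the predicted value for student performance calculated by the regression analysis.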
The next example calculates the testing error. The only difference from the previous example is that ml.is_training is set to false this time, so the query excludes the training split from the evaluation.
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": { "value": false }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3",
      "predicted_field": "ml.G3_prediction",
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
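- query: In this example, a test/train split was defined for the regression analysis. The query limits the evaluation to the test split (documents where ml.is_training is false).
- actual_field: The field that contains the ground truth value for the actual student performance. This is required in order to evaluate results.
- predicted_field: The field that contains the predicted value for student performance calculated by the regression analysis.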
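Because the training-error and testing-error requests differ only in the ml.is_training value, both can be produced by one function. The sketch below prepares the two HTTP requests with Python's standard library. The function name evaluate_request and the base URL http://localhost:9200 are assumptions for illustration, and authentication is left out. The requests are built but not sent.

```python
import json
import urllib.request


def evaluate_request(base_url, index, is_training):
    """Prepare a POST _ml/data_frame/_evaluate request for one split.

    Hypothetical helper: is_training=True selects the training split,
    is_training=False selects the test split.
    """
    body = {
        "index": index,
        "query": {"term": {"ml.is_training": {"value": is_training}}},
        "evaluation": {
            "regression": {
                "actual_field": "G3",
                "predicted_field": "ml.G3_prediction",
                "metrics": {"r_squared": {}, "mean_squared_error": {}},
            }
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/_ml/data_frame/_evaluate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local, unsecured cluster address; adjust for a real deployment.
BASE_URL = "http://localhost:9200"
INDEX = "student_performance_mathematics_reg"

train_request = evaluate_request(BASE_URL, INDEX, is_training=True)
test_request = evaluate_request(BASE_URL, INDEX, is_training=False)

# Sending requires a reachable cluster, for example:
#   with urllib.request.urlopen(train_request) as response:
#       training_error = json.load(response)
```

Keeping the split as a single boolean parameter makes it harder for the two requests to drift apart in any other respect, such as the field names or the requested metrics.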