Hyperparameter optimization

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

When you create a data frame analytics job for classification or regression analysis, you can set advanced configuration options known as hyperparameters. The ideal hyperparameter values vary from one data set to another. Therefore, by default, the job determines the best combination of values through a process of hyperparameter optimization.

Hyperparameter optimization runs multiple rounds of analysis. Each round uses a different set of hyperparameter values, chosen through a mix of random search and Bayesian optimization techniques. If you explicitly set a hyperparameter, that value is not optimized and remains the same in each round. To determine which round produces the best results, the job uses stratified K-fold cross-validation to split the data set, train a model, and calculate its performance on validation data.
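The rounds described above can be illustrated with a small, self-contained Python sketch. It is not how the job is implemented: it uses random search only (the Bayesian optimization step is omitted), and the model is a toy one-dimensional threshold classifier. The names `stratified_k_folds`, `validation_loss`, `optimize`, `weight`, and `shift` are hypothetical and exist only for this illustration.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k, rng):
    """Split row indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def class_mean(xs, labels, rows, target):
    values = [xs[i] for i in rows if labels[i] == target]
    return sum(values) / len(values)


def validation_loss(params, xs, labels, train, valid):
    """Toy stand-in model: a threshold placed between the two class means."""
    low = class_mean(xs, labels, train, 0)
    high = class_mean(xs, labels, train, 1)
    threshold = low + params["weight"] * (high - low) + params["shift"]
    errors = sum((xs[i] > threshold) != (labels[i] == 1) for i in valid)
    return errors / len(valid)  # misclassification rate on the held-out fold


def optimize(search_space, fixed, xs, labels, rounds=20, k=3, seed=0):
    """Random search; explicitly fixed values are never resampled."""
    rng = random.Random(seed)
    folds = stratified_k_folds(labels, k, rng)
    best_params, best_loss = None, float("inf")
    for _ in range(rounds):
        # Draw a new value for every hyperparameter that was not set explicitly.
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in search_space.items()
                  if name not in fixed}
        params.update(fixed)
        fold_losses = []
        for held_out in range(k):
            train = [i for n, fold in enumerate(folds) if n != held_out for i in fold]
            fold_losses.append(validation_loss(params, xs, labels, train, folds[held_out]))
        mean_loss = sum(fold_losses) / k
        if mean_loss < best_loss:
            best_params, best_loss = params, mean_loss
    return best_params, best_loss


xs = [0.1, 0.4, 0.5, 0.9, 1.1, 1.3, 2.0, 2.2, 2.6, 2.9, 3.1, 3.3]
labels = [0] * 6 + [1] * 6
space = {"weight": (0.0, 1.0), "shift": (-2.0, 2.0)}
best, loss = optimize(space, fixed={"weight": 0.5}, xs=xs, labels=labels)
print(best, loss)
```

Because `weight` is passed in `fixed`, it keeps the value 0.5 in every round and only `shift` is searched, which mirrors the behavior described for explicitly set hyperparameters.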

You can view the hyperparameter values that were ultimately chosen by expanding the job details in Kibana or by using the get data frame analytics job stats API. You can also see the specific type of validation loss (such as mean squared error or binomial cross entropy) that was used to compare each round of optimization.
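As a rough illustration, the following Python sketch pulls the chosen hyperparameters and the validation loss type out of a stats response that has already been parsed into a dictionary. The response shape used here (`analysis_stats`, `classification_stats` or `regression_stats`, `hyperparameters`, `validation_loss.loss_type`) is an assumption about the API output and should be checked against the API reference for your version. The job ID and all values in `sample_response` are invented for the example.

```python
def chosen_hyperparameters(stats_response):
    """Map each job ID to its chosen hyperparameters and validation loss type.

    Assumes the response layout described in the lead-in; missing keys
    are tolerated and simply produce empty results.
    """
    results = {}
    for job in stats_response.get("data_frame_analytics", []):
        analysis_stats = job.get("analysis_stats", {})
        for key in ("classification_stats", "regression_stats"):
            stats = analysis_stats.get(key)
            if stats is None:
                continue
            results[job["id"]] = {
                "hyperparameters": stats.get("hyperparameters", {}),
                "loss_type": stats.get("validation_loss", {}).get("loss_type"),
            }
    return results


# Invented sample data, trimmed to the fields this sketch reads.
sample_response = {
    "count": 1,
    "data_frame_analytics": [
        {
            "id": "example-regression-job",
            "analysis_stats": {
                "regression_stats": {
                    "hyperparameters": {"eta": 0.05, "max_trees": 120},
                    "validation_loss": {"loss_type": "mse"},
                }
            },
        }
    ],
}

for job_id, summary in chosen_hyperparameters(sample_response).items():
    print(job_id, summary["loss_type"], summary["hyperparameters"])
```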

Unless you fully understand the purpose of a hyperparameter, it is highly recommended that you leave it unset and allow hyperparameter optimization to occur.
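In practice, leaving a hyperparameter unset means omitting it from the job configuration. The Python sketch below builds a request body for a regression job in both ways. It is a minimal illustration: the helper `regression_job_body`, the index names, and the field name `price` are hypothetical, and the body layout (`source`, `dest`, `analysis.regression`) and the `eta` option should be verified against the create data frame analytics jobs API reference before use.

```python
import json


def regression_job_body(source_index, dest_index, dependent_variable, **pinned):
    """Build a job configuration; only hyperparameters passed in are pinned."""
    analysis = {"dependent_variable": dependent_variable}
    # Anything added here is held constant instead of being optimized.
    analysis.update(pinned)
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "analysis": {"regression": analysis},
    }


# Recommended: no hyperparameters set, so all of them are optimized.
optimized = regression_job_body("houses", "houses-predictions", "price")

# Only when you understand the option: pin one value, optimize the rest.
pinned_eta = regression_job_body("houses", "houses-predictions", "price", eta=0.05)

print(json.dumps(optimized, indent=2))
print(json.dumps(pinned_eta, indent=2))
```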