Validates an anomaly detection job | Elasticsearch API (v8)

Validates an anomaly detection job Added in 6.3.0

POST /_ml/anomaly_detectors/_validate

application/json

Body Required

job_id string
analysis_config object

Additional properties are allowed.
- bucket_span string
  
  A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.
- categorization_analyzer string | object
  
  One of:
  _types:CategorizationAnalyzer (string) or _types:CategorizationAnalyzerDefinition (object)
  
  char_filter array
  
  One or more character filters. In addition to the built-in character filters, other plugins can provide more character filters. If this property is not specified, no character filters are applied prior to categorization. If you are customizing some other aspect of the analyzer and you need to achieve the equivalent of categorization_filters (which are not permitted when some other aspect of the analyzer is customized), add them here as pattern replace character filters.
  
  filter array
  
  One or more token filters. In addition to the built-in token filters, other plugins can provide more token filters. If this property is not specified, no token filters are applied prior to categorization.
  
  tokenizer object | string
  
  The name or definition of the tokenizer to use after character filters are applied. This property is compulsory if categorization_analyzer is specified as an object. Machine learning provides a tokenizer called ml_standard that tokenizes in a way that has been determined to produce good categorization results on a variety of log file formats for logs in English. If you want to use that tokenizer but change the character or token filters, specify "tokenizer": "ml_standard" in your categorization_analyzer. Additionally, the ml_classic tokenizer is available, which tokenizes in the same way as the non-customizable tokenizer in old versions of the product (before 6.2). ml_classic was the default categorization tokenizer in versions 6.2 to 7.13, so if you need categorization identical to the default for jobs created in these versions, specify "tokenizer": "ml_classic" in your categorization_analyzer.
  
  One of:
  object or string
  
  Additional properties are allowed.
  
- categorization_field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
- categorization_filters array[string]
  
  If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.
- detectors array[object] Required
  
  Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.
  
  by_field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
  
  custom_rules array[object]
  
  Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.
  
  actions array[string]
  
  The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.
  
  Values are skip_result or skip_model_update.
  
  conditions array[object]
  
  An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.
  
  Additional properties are allowed.
  
  scope object
  
  A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.
  
  * object Additional properties
  
  Additional properties are allowed.
  
  detector_description string
  
  A description of the detector.
  
  detector_index number
  
  A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.
  
  exclude_frequent string
  
  Values are all, none, by, or over.
  
  field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
  
  function string
  
  The analysis function that is used. For example, count, rare, mean, min, max, or sum.
  
  over_field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
  
  partition_field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
  
  use_null boolean
  
  Defines whether a new series is used as the null series when there is no value for the by or partition fields.
- influencers array[string]
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
- latency string
  
  A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.
- model_prune_window string
  
  A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.
- multivariate_by_fields boolean
  
  This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.
- per_partition_categorization object
  
  Additional properties are allowed.
  
  enabled boolean
  
  To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.
  
  stop_on_warn boolean
  
  This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stop for partitions where the categorization status changes to warn. This makes it viable to run a job where categorization is expected to work well for some partitions but not others: you do not keep paying the cost of bad categorization in the partitions where it works badly.
- summary_count_field_name string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
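The duration format described for bucket_span, latency, and model_prune_window can be checked on the client before a request is sent. The following is a minimal sketch in Python, not the parser Elasticsearch itself uses. The helper name is_duration is hypothetical, and the requirement that the value be a non-negative whole number immediately followed by its unit is an assumption that this reference does not state.

```python
import re

# Units listed in this reference: nanos, micros, ms, s, m, h, d.
# Assumption: a non-negative integer directly followed by one unit.
_DURATION = re.compile(r"\d+(nanos|micros|ms|s|m|h|d)")


def is_duration(value: str) -> bool:
    """Return True if value matches the duration format described above."""
    # "0" and "-1" are accepted without a unit.
    if value in ("0", "-1"):
        return True
    return _DURATION.fullmatch(value) is not None


for candidate in ("15m", "500ms", "0", "-1", "15", "15 minutes"):
    print(candidate, is_duration(candidate))
```

A check like this only catches formatting mistakes early. Whether a given value is acceptable for a particular job is still decided by the API.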
analysis_limits object

Additional properties are allowed.
- categorization_examples_limit number
  
  The maximum number of examples stored per category in memory and in the results data store. If you increase this value, more examples are available, but more storage is required. If you set this value to 0, no examples are stored. NOTE: The categorization_examples_limit applies only to analysis that uses categorization.
- model_memory_limit number | string
  
  One of:
  _types:ByteSize as a number or _types:ByteSize as a string
data_description object

Additional properties are allowed.
- format string
  
  Only JSON format is supported at this time.
- time_field string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
- time_format string
  
  The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.
- field_delimiter string
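To make the three kinds of time_format concrete, the sketch below renders one timestamp as epoch seconds, as epoch milliseconds, and as the text that the example pattern yyyy-MM-dd'T'HH:mm:ssX would describe for a UTC time. It is written in Python for illustration only. The function names and the sample date are hypothetical, and the strftime format is an approximation of the Java pattern that holds for UTC timestamps, where X is written as Z.

```python
from datetime import datetime, timezone


def to_epoch(ts: datetime) -> float:
    """Seconds since 1 Jan 1970 (the "epoch" time_format)."""
    return ts.timestamp()


def to_epoch_ms(ts: datetime) -> int:
    """Milliseconds since the epoch (the "epoch_ms" time_format)."""
    return int(round(ts.timestamp() * 1000))


def to_pattern_utc(ts: datetime) -> str:
    """UTC text matching the pattern yyyy-MM-dd'T'HH:mm:ssX."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


sample = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
print(to_epoch(sample))        # 1704067200.0
print(to_epoch_ms(sample))     # 1704067200000
print(to_pattern_utc(sample))  # 2024-01-01T00:00:00Z
```

The last form carries the full date, time, and time zone, which is what the recommendation above asks for when a custom pattern is used.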
description string
model_plot object

Additional properties are allowed.
- annotations_enabled boolean
  
  If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.
- enabled boolean
  
  If true, enables calculation and storage of the model bounds for each entity that is being analyzed.
- terms string
  
  Path to a field or an array of paths. Some APIs support wildcards in the path to select multiple fields.
model_snapshot_id string
model_snapshot_retention_days number
results_index_name string
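Several of the constraints stated in the property descriptions above can be checked before the body is sent. The following Python sketch collects violations of a few of them: at least one detector, categorization_filters not combined with categorization_analyzer, a compulsory tokenizer when the analyzer is an object, a scope or condition on every custom rule, by_field_name when multivariate_by_fields is set, and stop_on_warn only with per-partition categorization enabled. The function name precheck_job, the field name message, and the filter expression are hypothetical. The check is deliberately partial and does not replace the validation that the API performs.

```python
def precheck_job(body: dict) -> list:
    """Collect violations of a few rules stated in this reference.

    Hypothetical client-side helper; the _validate API remains the authority.
    """
    problems = []
    config = body.get("analysis_config", {})
    detectors = config.get("detectors") or []

    if not detectors:
        problems.append("detectors must contain at least one detector")

    if "categorization_filters" in config and "categorization_analyzer" in config:
        problems.append("categorization_filters cannot be combined with categorization_analyzer")

    analyzer = config.get("categorization_analyzer")
    if isinstance(analyzer, dict) and "tokenizer" not in analyzer:
        problems.append("tokenizer is compulsory when categorization_analyzer is an object")

    for index, detector in enumerate(detectors):
        for rule in detector.get("custom_rules", []):
            # A rule needs a non-empty scope or at least one condition.
            if not rule.get("scope") and not rule.get("conditions"):
                problems.append(f"detector {index}: rule needs a non-empty scope or a condition")
        if config.get("multivariate_by_fields") and "by_field_name" not in detector:
            problems.append(f"detector {index}: multivariate_by_fields requires by_field_name")

    partitioning = config.get("per_partition_categorization", {})
    if partitioning.get("stop_on_warn") and not partitioning.get("enabled"):
        problems.append("stop_on_warn requires per-partition categorization to be enabled")

    return problems


# Hypothetical body that breaks three of the rules above.
example = {
    "analysis_config": {
        "categorization_field_name": "message",
        "categorization_filters": ["\\[statement:.*\\]"],
        "categorization_analyzer": {"char_filter": []},
        "detectors": [],
    }
}
for problem in precheck_job(example):
    print(problem)
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.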

Responses

200 application/json
- acknowledged boolean Required
  
  For a successful response, this value is always true. On failure, an exception is returned instead.
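A client can treat the response as a simple success flag. The Python sketch below returns True only when the body is a JSON object whose acknowledged value is true, and treats everything else as a failure. The function name is_acknowledged is hypothetical, and the error body in the usage lines is an invented placeholder, since this reference does not describe the shape of the exception.

```python
import json


def is_acknowledged(response_text: str) -> bool:
    """Return True only for a JSON object with "acknowledged": true."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        # Not JSON at all: treat as a failure.
        return False
    return isinstance(payload, dict) and payload.get("acknowledged") is True


print(is_acknowledged('{"acknowledged": true}'))
# Hypothetical error body, for illustration only.
print(is_acknowledged('{"error": {"reason": "hypothetical failure"}, "status": 400}'))
```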

POST /_ml/anomaly_detectors/_validate

curl \
 -X POST http://api.example.com/_ml/anomaly_detectors/_validate \
 -H "Content-Type: application/json" \
 -d '{"job_id":"string","analysis_config":{"bucket_span":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{"additionalProperty1":{},"additionalProperty2":{}}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"model_memory_limit":42.0},"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"description":"string","model_plot":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_id":"string","model_snapshot_retention_days":42.0,"results_index_name":"string"}'

Request examples

{
  "job_id": "string",
  "analysis_config": {
    "bucket_span": "string",
    "categorization_field_name": "string",
    "categorization_filters": [
      "string"
    ],
    "detectors": [
      {
        "by_field_name": "string",
        "custom_rules": [
          {
            "actions": [
              "skip_result"
            ],
            "conditions": [
              {}
            ],
            "scope": {
              "additionalProperty1": {},
              "additionalProperty2": {}
            }
          }
        ],
        "detector_description": "string",
        "detector_index": 42.0,
        "exclude_frequent": "all",
        "field_name": "string",
        "function": "string",
        "over_field_name": "string",
        "partition_field_name": "string",
        "use_null": true
      }
    ],
    "influencers": [
      "string"
    ],
    "latency": "string",
    "model_prune_window": "string",
    "multivariate_by_fields": true,
    "per_partition_categorization": {
      "enabled": true,
      "stop_on_warn": true
    },
    "summary_count_field_name": "string"
  },
  "analysis_limits": {
    "categorization_examples_limit": 42.0,
    "model_memory_limit": 42.0
  },
  "data_description": {
    "format": "string",
    "time_field": "string",
    "time_format": "string",
    "field_delimiter": "string"
  },
  "description": "string",
  "model_plot": {
    "annotations_enabled": true,
    "enabled": true,
    "terms": "string"
  },
  "model_snapshot_id": "string",
  "model_snapshot_retention_days": 42.0,
  "results_index_name": "string"
}
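The request example above is a schema-shaped placeholder in which every value is a stand-in. As a more concrete illustration, the Python sketch below builds a small body with a single count detector and prepares the POST request without sending it. The job ID, the time field name, the host, and the helper name build_validate_request are all hypothetical, and the body is an assumption about what a minimal job might look like rather than a configuration taken from this reference.

```python
import json
import urllib.request

# Hypothetical values: the job ID, field name and host are placeholders.
job = {
    "job_id": "example-job",
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "count"}],
    },
    "data_description": {"time_field": "timestamp", "time_format": "epoch_ms"},
}


def build_validate_request(base_url: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the _validate endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_ml/anomaly_detectors/_validate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_validate_request("http://api.example.com", job)
print(request.get_method(), request.full_url)
# Sending it requires a reachable cluster and usually authentication:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the body easy to inspect or log before it reaches a cluster.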

Response examples (200)

{
  "acknowledged": true
}
