Validates an anomaly detection job Added in 6.3.0

POST /_ml/anomaly_detectors/_validate
application/json

Body Required

  • job_id string
  • Additional properties are allowed.

    Hide analysis_config attributes Show analysis_config attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • categorization_analyzer string | object

      One of:
    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

    • detectors array[object] Required

      Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

      Hide detectors attributes Show detectors attributes object
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • custom_rules array[object]

        Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

        Hide custom_rules attributes Show custom_rules attributes object
        • actions array[string]

          The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

          Values are skip_result or skip_model_update.

        • conditions array[object]

          An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          Additional properties are allowed.

        • scope object

          A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

          Hide scope attribute Show scope attribute object
          • * object Additional properties

            Additional properties are allowed.

      • A description of the detector.

      • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

      • Values are all, none, by, or over.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • function string

        The analysis function that is used. For example, count, rare, mean, min, max, or sum.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • use_null boolean

        Defines whether a new series is used as the null series when there is no value for the by or partition fields.

    • influencers array[string]

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • latency string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

    • Additional properties are allowed.

      Hide per_partition_categorization attributes Show per_partition_categorization attributes object
      • enabled boolean

        To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

      • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Additional properties are allowed.

    Hide analysis_limits attributes Show analysis_limits attributes object
    • The maximum number of examples stored per category in memory and in the results data store. If you increase this value, more examples are available, however it requires that you have more storage available. If you set this value to 0, no examples are stored. NOTE: The categorization_examples_limit applies only to analysis that uses categorization.

    • The approximate maximum amount of memory resources that are required for analytical processing. Once this limit is approached, data pruning becomes more aggressive. Upon exceeding this limit, new entities are not modeled. If the xpack.ml.max_model_memory_limit setting has a value greater than 0 and less than 1024mb, that value is used instead of the default. The default value is relatively small to ensure that high resource usage is a conscious decision. If you have jobs that are expected to analyze high cardinality fields, you will likely need to use a higher value. If you specify a number instead of a string, the units are assumed to be MiB. Specifying a string is recommended for clarity. If you specify a byte size unit of b or kb and the number does not equate to a discrete number of megabytes, it is rounded down to the closest MiB. The minimum valid value is 1 MiB. If you specify a value less than 1 MiB, an error occurs. If you specify a value for the xpack.ml.max_model_memory_limit setting, an error occurs when you try to create jobs that have model_memory_limit values greater than that setting value.

  • Additional properties are allowed.

    Hide data_description attributes Show data_description attributes object
    • format string

      Only JSON format is supported at this time.

    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

  • Additional properties are allowed.

    Hide model_plot attributes Show model_plot attributes object
    • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

    • enabled boolean

      If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

    • terms string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/anomaly_detectors/_validate
curl \
 -X POST http://api.example.com/_ml/anomaly_detectors/_validate \
 -H "Content-Type: application/json" \
 -d '{"job_id":"string","analysis_config":{"bucket_span":"string","":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{"additionalProperty1":{},"additionalProperty2":{}}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"model_memory_limit":"string"},"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"description":"string","model_plot":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_id":"string","model_snapshot_retention_days":42.0,"results_index_name":"string"}'
Request examples
{
  "job_id": "string",
  "analysis_config": {
    "bucket_span": "string",
    "": "string",
    "categorization_field_name": "string",
    "categorization_filters": [
      "string"
    ],
    "detectors": [
      {
        "by_field_name": "string",
        "custom_rules": [
          {
            "actions": [
              "skip_result"
            ],
            "conditions": [
              {}
            ],
            "scope": {
              "additionalProperty1": {},
              "additionalProperty2": {}
            }
          }
        ],
        "detector_description": "string",
        "detector_index": 42.0,
        "exclude_frequent": "all",
        "field_name": "string",
        "function": "string",
        "over_field_name": "string",
        "partition_field_name": "string",
        "use_null": true
      }
    ],
    "influencers": [
      "string"
    ],
    "latency": "string",
    "model_prune_window": "string",
    "multivariate_by_fields": true,
    "per_partition_categorization": {
      "enabled": true,
      "stop_on_warn": true
    },
    "summary_count_field_name": "string"
  },
  "analysis_limits": {
    "categorization_examples_limit": 42.0,
    "model_memory_limit": "string"
  },
  "data_description": {
    "format": "string",
    "time_field": "string",
    "time_format": "string",
    "field_delimiter": "string"
  },
  "description": "string",
  "model_plot": {
    "annotations_enabled": true,
    "enabled": true,
    "terms": "string"
  },
  "model_snapshot_id": "string",
  "model_snapshot_retention_days": 42.0,
  "results_index_name": "string"
}
Response examples (200)
{
  "acknowledged": true
}