Elasticsearch query

The Elasticsearch query rule type runs a user-configured query, compares the number of matches to a configured threshold, and schedules actions to run when the threshold condition is met.

Create the rule

Fill in the rule details, then select Elasticsearch query.

Define the conditions

Define properties to detect the condition.

The following six clauses define the condition to detect:
Index
Specifies an index or data view and a time field that is used for the time window.
Size
Specifies the number of documents to pass to the configured actions when the threshold condition is met.
Elasticsearch query
Specifies the Elasticsearch Query DSL query. The number of documents that match this query is evaluated against the threshold condition. Only the query, fields, _source, and runtime_mappings fields are used; other DSL fields are ignored.
Threshold
Defines a threshold value and a comparison operator (is above, is above or equals, is below, is below or equals, or is between). The number of documents that match the specified query is compared to this threshold.
Time window
Defines how far back to search for documents, using the time field set in the index clause. Generally, set this to a value higher than the check every value in the general rule details to avoid gaps in detection.
Exclude matches from previous run
Turn on to avoid alert duplication by excluding documents that have already been detected by the previous rule run.
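
As a rough illustration of how the Elasticsearch query and Threshold clauses interact, the following Python sketch filters a search body down to the four fields the rule uses and then compares a match count to a threshold. The function names are hypothetical, and treating is between as inclusive at both ends is an assumption; this is not the rule type's actual implementation.

```python
from __future__ import annotations

from typing import Any

# The only top-level DSL fields the rule uses, per the clause description above.
USED_DSL_KEYS = ("query", "fields", "_source", "runtime_mappings")


def used_dsl_fields(body: dict[str, Any]) -> dict[str, Any]:
    """Return the part of a search body that the rule takes into account."""
    return {key: body[key] for key in USED_DSL_KEYS if key in body}


def threshold_met(count: int, comparator: str, threshold: list[int]) -> bool:
    """Compare a match count to a threshold using the comparators listed above."""
    if comparator == "is above":
        return count > threshold[0]
    if comparator == "is above or equals":
        return count >= threshold[0]
    if comparator == "is below":
        return count < threshold[0]
    if comparator == "is below or equals":
        return count <= threshold[0]
    if comparator == "is between":
        # Assumption: both bounds are inclusive.
        low, high = threshold
        return low <= count <= high
    raise ValueError(f"unknown comparator: {comparator}")


# Hypothetical search body: "size" and "aggs" would not be considered.
body = {"query": {"match_all": {}}, "size": 10, "aggs": {}, "fields": ["day_of_week"]}
print(used_dsl_fields(body))
print(threshold_met(2, "is above", [1]))
```

The second call mirrors the context.message example later on this page, where a value of 2 is compared against "greater than 1".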

Add action variables

Add an action to run when the rule condition is met. The following variables are specific to the Elasticsearch query rule. You can also specify variables common to all rules.

context.title
A preconstructed title for the rule. Example: rule term match alert query matched.
context.message
A preconstructed message for the rule. Example:
rule 'my es-query' is active:
- Value: 2
- Conditions Met: Number of matching documents is greater than 1 over 5m
- Timestamp: 2022-02-03T20:29:27.732Z
context.group
The name of the action group associated with the condition. Example: query matched.
context.date
The date, in ISO format, that the rule met the condition. Example: 2022-02-03T20:29:27.732Z.
context.value
The value of the rule that met the condition.
context.conditions
A description of the condition. Example: count greater than 4.
context.hits

The most recent documents that matched the query. Using the Mustache template array syntax, you can iterate over these hits to get values from the Elasticsearch documents into your actions.

Iterate over hits using Mustache template syntax

The documents returned by context.hits include the _source field. If the query uses the search API's fields parameter, each document also includes a fields field, which you can use to access any runtime fields defined by the runtime_mappings parameter, as the following example shows:

{{#context.hits}}
timestamp: {{_source.@timestamp}}
day of the week: {{fields.day_of_week}} 
{{/context.hits}}

The fields parameter here is used to access the day_of_week runtime field.

As the fields response always returns an array of values for each field, the Mustache template array syntax is used to iterate over these values in your actions as the following example shows:

{{#context.hits}}
Labels:
{{#fields.labels}}
- {{.}}
{{/fields.labels}}
{{/context.hits}}
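
To make the shape of the data concrete, here is a small Python sketch that walks a hypothetical context.hits value the same way the template above does. The sample documents and the render_labels helper are made up for illustration. The output approximates what a Mustache renderer would produce; exact whitespace handling may differ.

```python
from __future__ import annotations

from typing import Any


def render_labels(hits: list[dict[str, Any]]) -> str:
    """Emulate the nested Mustache sections with plain loops."""
    lines = []
    for hit in hits:  # {{#context.hits}}
        lines.append("Labels:")
        # Each entry under "fields" is an array, even for a single value.
        for label in hit.get("fields", {}).get("labels", []):  # {{#fields.labels}}
            lines.append(f"- {label}")  # {{.}}
    return "\n".join(lines)


# Hypothetical hits; only the "fields" part matters for this template.
sample_hits = [
    {"_source": {"@timestamp": "2022-02-03T20:29:27.732Z"}, "fields": {"labels": ["prod", "eu"]}},
    {"_source": {"@timestamp": "2022-02-03T20:30:01.100Z"}, "fields": {"labels": ["staging"]}},
]

print(render_labels(sample_hits))
```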

Test your query

Use the Test query feature to verify that your query DSL is valid.

  • Valid queries are run against the configured index using the configured time window. The number of documents that match the query is displayed.
  • An error message is shown if the query is invalid.

Handling multiple matches of the same document

By default, Exclude matches from previous run is turned on and the rule checks for duplication of document matches across multiple runs. If you configure the rule with a schedule interval smaller than the time window and a document matches a query in multiple runs, it is alerted on only once.

The rule uses the timestamp of the matches to avoid alerting on the same match multiple times. The timestamp of the latest match is used for evaluating the rule conditions when the rule runs. Only matches between the latest timestamp from the previous run and the current run are considered.
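
The timestamp-based exclusion described above can be pictured with a minimal Python sketch. The function names are hypothetical, the @timestamp key stands in for whatever time field the rule is configured with, and the strict "newer than" comparison is an assumption about how the boundary is handled.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any, Optional


def exclude_previous_matches(
    matches: list[dict[str, Any]], latest_seen: Optional[datetime]
) -> list[dict[str, Any]]:
    """Keep only matches newer than the latest timestamp from the previous run."""
    if latest_seen is None:  # first run: nothing to exclude
        return list(matches)
    # Assumption: a match with exactly the same timestamp counts as already seen.
    return [m for m in matches if m["@timestamp"] > latest_seen]


def latest_match_timestamp(
    matches: list[dict[str, Any]], previous: Optional[datetime]
) -> Optional[datetime]:
    """Return the newest timestamp seen so far, to carry into the next run."""
    stamps = [m["@timestamp"] for m in matches]
    if previous is not None:
        stamps.append(previous)
    return max(stamps) if stamps else None


def at(minute: int, second: int) -> datetime:
    """Build a UTC timestamp on an arbitrary example day."""
    return datetime(2022, 2, 3, 20, minute, second, tzinfo=timezone.utc)


# Run 1 sees two documents; run 2 sees the same two plus one new document.
run_1 = [{"@timestamp": at(28, 10)}, {"@timestamp": at(29, 27)}]
seen = latest_match_timestamp(run_1, None)
run_2 = run_1 + [{"@timestamp": at(30, 5)}]
fresh = exclude_previous_matches(run_2, seen)
print(len(fresh))
```

Only the document from 20:30:05 survives the second run, so it is the only one counted against the threshold.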

Suppose you have a rule configured to run every minute. The rule uses a time window of 1 hour and checks if there are more than 99 matches for the query. The Elasticsearch query rule type does the following:

Run 1 (0:00)

Rule finds 113 matches in the last hour: 113 > 99

Rule is active and user is alerted.

Run 2 (0:01)

Rule finds 127 matches in the last hour. 105 of the matches are duplicates that were already alerted on previously, so only 22 matches are new: 22 ≤ 99

No alert.

Run 3 (0:02)

Rule finds 159 matches in the last hour. 88 of the matches are duplicates that were already alerted on previously, so only 71 matches are new: 71 ≤ 99

No alert.

Run 4 (0:03)

Rule finds 190 matches in the last hour. 71 of the matches are duplicates that were already alerted on previously, so 119 matches are new: 119 > 99

Rule is active and user is alerted.
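
The arithmetic of the four runs above can be checked with a short Python sketch. The evaluate helper is hypothetical; the numbers are taken directly from the example.

```python
THRESHOLD = 99  # the rule checks for more than 99 matches

# (run time, matches found in the last hour, duplicates already alerted on)
RUNS = [
    ("0:00", 113, 0),
    ("0:01", 127, 105),
    ("0:02", 159, 88),
    ("0:03", 190, 71),
]


def evaluate(found: int, duplicates: int, threshold: int = THRESHOLD) -> tuple:
    """Return the number of new matches and whether the rule is active."""
    new_matches = found - duplicates
    return new_matches, new_matches > threshold


results = [evaluate(found, duplicates) for _, found, duplicates in RUNS]

for (time, _, _), (new_matches, active) in zip(RUNS, results):
    status = "alert" if active else "no alert"
    print(f"Run at {time}: {new_matches} new matches, {status}")
```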

Known issues

There is a known issue in 8.5 and 8.6 that results in corruption of the rule definition when you update API keys or add or remove snooze schedules in Stack Management > Rules. In particular, this bug affects Elasticsearch query rules with the KQL or Lucene query type and tracking containment rules. As a result of this bug, an "Unable to load rules" error occurs in Rules.

The long-term solution is to migrate to the latest release; 8.7 and later releases contain the fix for this bug. If you encounter this bug in 8.6, you can recover access to your rules in Kibana by using APIs to delete and recreate them:

  1. Find the affected rules. For example, run the following query in Dev Tools:

    GET .kibana*/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "alert.alertTypeId": [
                  ".es-query",
                  ".geo-containment"
                ]
              }
            }
          ],
          "must_not": {
            "exists": {
              "field": "references"
            }
          }
        }
      }
    }
  2. Make a copy of the query output, since you will use it to recreate the rules.
  3. Delete the affected rules. For example, run the following request in Dev Tools, replacing <rule_id> with the appropriate rule identifiers:

    DELETE kbn:/api/alerting/rule/<rule_id>
  4. Recreate the rules. For example, use Stack Management > Rules or the create rule API with the property values obtained from your query output.
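
If you have many rules to recreate, steps 2 and 4 could be partly scripted. The Python sketch below turns one hit from the saved query output into a rule identifier and a candidate request body for the create rule API. It rests on several assumptions that you should check against your own query output and the API reference: that the document _id has the form alert:<rule_id>, that the rule properties sit under _source.alert with the names shown, and that the request body uses the snake_case names shown. Actions are left empty because they refer to connectors that you need to reattach yourself. The sketch sends no requests.

```python
from __future__ import annotations

from typing import Any


def create_rule_request(hit: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Build (rule_id, request body) from one hit of the saved query output."""
    # Assumption: saved object document ids look like "alert:<rule_id>".
    rule_id = hit["_id"].split(":", 1)[-1]
    alert = hit["_source"]["alert"]
    body = {
        "name": alert["name"],
        "rule_type_id": alert["alertTypeId"],  # for example ".es-query"
        "consumer": alert["consumer"],
        "schedule": alert["schedule"],
        "params": alert["params"],
        "tags": alert.get("tags", []),
        # Assumption: actions are re-added manually after the rule exists.
        "actions": [],
    }
    # Copy optional notification settings only when the saved rule had them.
    if alert.get("notifyWhen") is not None:
        body["notify_when"] = alert["notifyWhen"]
    if alert.get("throttle") is not None:
        body["throttle"] = alert["throttle"]
    return rule_id, body


# Hypothetical hit, trimmed to the properties used above.
example_hit = {
    "_id": "alert:1234",
    "_source": {
        "alert": {
            "name": "my es-query",
            "alertTypeId": ".es-query",
            "consumer": "alerts",
            "schedule": {"interval": "1m"},
            "params": {"threshold": [99], "timeWindowSize": 1, "timeWindowUnit": "h"},
            "notifyWhen": "onActiveAlert",
        }
    },
}

rule_id, body = create_rule_request(example_hit)
print(rule_id, body["rule_type_id"])
```

The resulting body would then be reviewed and submitted through the create rule API or re-entered in Stack Management > Rules.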

If you update the API keys or add or remove snooze schedules again, the problem will recur until you upgrade to a release that contains the fix.