Create and manage rules

edit

Create and manage rules

edit

The Stack Management > Rules and Connectors UI provides a cross-app view of alerting. Different Kibana apps like Observability, Security, Maps and Machine Learning can offer their own rules. Rules and Connectors provides a central place to:

Example rule listing in Rules and Connectors

For more information on alerting concepts and the types of rules and connectors available, go to Alerting.

Required permissions

edit

Access to rules is granted based on your privileges to alerting features. For more information, go to Security.

Create and edit rules

edit

Many rules must be created within the context of a Kibana app like Metrics, APM, or Uptime, but others are generic. Generic rule types can be created in Rules and Connectors by clicking the Create rule button. This will launch a flyout that guides you through selecting a rule type and configuring its conditions and action type. For details on what types of rules are available and how to configure them, refer to Stack rules.

After a rule is created, you can open the action menu (…) and select Edit rule to re-open the flyout and change the rule properties.

General rule details

edit

All rules share the following four properties:

Name
The name of the rule. While this name does not have to be unique, a distinctive name can help you identify a rule.
Tags
A list of tag names that can be applied to a rule. Tags can help you organize and find rules.
Check every
Defines how often to evaluate the rule condition. Checks are queued; they run as close to the defined value as capacity allows. For more details, go to Alerting production considerations.
Notify

Defines how often alerts generate actions. Options include running actions at each check interval, only when the alert status changes, or at a custom action interval.

Since actions are triggered per alert, a rule can end up generating a large number of actions. Take the following example where a rule is monitoring three servers every minute for CPU usage > 0.9, and the rule is set to notify On check intervals:

  • Minute 1: server X123 > 0.9. One email is sent for server X123.
  • Minute 2: X123 and Y456 > 0.9. Two emails are sent, one for X123 and one for Y456.
  • Minute 3: X123, Y456, Z789 > 0.9. Three emails are sent, one for each of X123, Y456, Z789.

In this example, three emails are sent for server X123 in the span of 3 minutes for the same rule. Often, it’s desirable to suppress these re-notifications. If you set the rule notify setting to On custom action intervals with an interval of 5 minutes, you reduce noise by getting emails only every 5 minutes for servers that continue to exceed the threshold:

  • Minute 1: server X123 > 0.9. One email is sent for server X123.
  • Minute 2: X123 and Y456 > 0.9. One email is sent for Y456.
  • Minute 3: X123, Y456, Z789 > 0.9. One email is sent for Z789.

To get notified only once when a server exceeds the threshold, you can set the rule notify setting to On status changes.

All rules have name, tags, check every, and notify properties in common

Rule type and conditions

edit

Depending upon the Kibana app and context, you might be prompted to choose the type of rule to create. Some apps will preselect the type of rule for you.

Choosing the type of rule to create

Each rule type provides its own way of defining the conditions to detect, but an expression formed by a series of clauses is a common pattern. Each clause has a UI control that allows you to define the clause. For example, in an index threshold rule, the WHEN clause allows you to select an aggregation operation to apply to a numeric field.

UI for defining rule conditions on an index threshold rule

Action type and details

edit

To receive notifications when a rule meets the defined conditions, you must add one or more actions. Start by selecting a type of connector for your action:

UI for selecting an action type

Each action must specify a connector instance. If no connectors exist for the selected type, click Add connector to create one.

After you have selected a connector, use the Run When dropdown to choose the action group to associate with this action. When a rule meets the defined condition, it is marked as Active and alerts are created and assigned to an action group. In addition to the action groups defined by the selected rule type, each rule also has a Recovered action group that is assigned when a rule’s conditions are no longer detected.

Each action type exposes different properties. For example, an email action allows you to set the recipients, the subject, and a message body in markdown format. See Connectors for details on the types of actions provided by Kibana and their properties.

UI for defining an email action
Action variables
edit

Using the Mustache template syntax {{variable name}}, you can pass rule values at the time a condition is detected to an action. You can access the list of available variables using the "add rule variable" button. Although available variables differ by rule type, all rule types pass the following variables:

rule.id
The ID of the rule.
rule.name
The name of the rule.
rule.spaceId
The ID of the space for the rule.
rule.tags
The list of tags applied to the rule.
date
The date the rule scheduled the action, in ISO format.
alert.id
The ID of the alert that scheduled the action.
alert.actionGroup
The ID of the action group of the alert that scheduled the action.
alert.actionSubgroup
The action subgroup of the alert that scheduled the action.
alert.actionGroupName
The name of the action group of the alert that scheduled the action.
kibanaBaseUrl
The configured server.publicBaseUrl. If not configured, this will be empty.
Passing rule values to an action

Some cases exist where the variable values will be "escaped", when used in a context where escaping is needed:

  • For the Email connector, the message action configuration property escapes any characters that would be interpreted as Markdown.
  • For the Slack connector, the message action configuration property escapes any characters that would be interpreted as Slack Markdown.
  • For the Webhook connector, the body action configuration property escapes any characters that are invalid in JSON string values.

Mustache also supports "triple braces" of the form {{{variable name}}}, which indicates no escaping should be done at all. Care should be used when using this form, as it could end up rendering the variable content in such a way as to make the resulting parameter invalid or formatted incorrectly.

Each rule type defines additional variables as properties of the variable context. For example, if a rule type defines a variable value, it can be used in an action parameter as {{context.value}}.

For diagnostic or exploratory purposes, action variables whose values are objects, such as context, can be referenced directly as variables. The resulting value will be a JSON representation of the object. For example, if an action parameter includes {{context}}, it will expand to the JSON representation of all the variables and values provided by the rule type.

You can attach more than one action. Clicking the Add action button will prompt you to select another rule type and repeat the above steps again.

Actions are not required on rules. You can run a rule without actions to understand its behavior, then configure actions later.

Snooze and disable rules

edit

The rule listing enables you to quickly snooze, disable, enable, or delete individual rules. For example, you can change the state of a rule:

Use the rule status dropdown to enable or disable an individual rule

When you snooze a rule, the rule checks continue to run on a schedule but the alert will not trigger any actions. You can snooze for a specified period of time, indefinitely, or schedule single or recurring downtimes:

Snooze notifications for a rule

When a rule is in a snoozed state, you can cancel or change the duration of this state.

Rule status

edit

A rule can have one of the following statuses:

active
The conditions for the rule have been met, and the associated actions should be invoked.
ok
The conditions for the rule have not been met, and the associated actions are not invoked.
error
An error was encountered by the rule.
pending
The rule has not yet run. The rule was either just created, or enabled after being disabled.
unknown
A problem occurred when calculating the status. Most likely, something went wrong with the alerting code.

Import and export rules

edit

To import and export rules, use Saved Objects.

Some rule types cannot be exported through this interface:

Security rules can be imported and exported using the Security UI.

Stack monitoring rules are automatically created for you and therefore cannot be managed in Saved Objects.

Rules are disabled on export. You are prompted to re-enable the rule on successful import.

Rules import banner

Drill down to rule details

edit

Select a rule name from the rule listing to access the Rule details page, which tells you about the state of the rule and provides granular control over the actions it is taking.

Rule details page with three alerts

In this example, the rule detects when a site serves more than a threshold number of bytes in a 24 hour period. Four sites are above the threshold. These are called alerts - occurrences of the condition being detected - and the alert name, status, time of detection, and duration of the condition are shown in this view. Alerts come and go from the list depending on whether the rule conditions are met.

When an alert is created, it generates actions. If the conditions that caused the alert persist, the actions run again according to the rule notification settings. There are two common alert statuses:

active
The conditions for the rule are met, and actions should be generated according to the notification settings.
recovered
The conditions for the rule are no longer met, and recovery actions should be generated.

You can suppress future actions for a specific alert by turning on the Mute toggle. If a muted alert no longer meets the rule conditions, it stays in the list to avoid generating actions if the conditions recur. You can also disable a rule, which stops it from running checks and clears any alerts it was tracking. You may want to disable rules that are not currently needed to reduce the load on Kibana and Elasticsearch.

Use the disable toggle to turn off rule checks and clear alerts tracked