Create a service-level objective (SLO) burn rate rule

To create and manage SLOs, you need an appropriate license, an Elasticsearch cluster with both transform and ingest node roles present, and SLO access configured.

You can create an SLO burn rate rule to get alerts when the burn rate is above a defined threshold for two different lookback periods: a long period and a short period that is 1/12th of the long period. For example, if your long lookback period is one hour, your short lookback period is five minutes.

For each lookback period, the burn rate is computed as the error rate divided by the error budget. When the burn rates for both periods surpass the threshold, an alert is triggered.
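
The calculation described above can be sketched in a few lines of Python. This is an illustration only, not Kibana's implementation: the function names are hypothetical, the sample error rates are made up, and the error budget is assumed to be `1 - target` (the usual SLO definition, which this page does not spell out).

```python
def short_lookback_minutes(long_lookback_hours: float) -> float:
    """Short lookback period: 1/12th of the long period, in minutes."""
    return long_lookback_hours * 60 / 12


def burn_rate(error_rate: float, slo_target: float) -> float:
    """Burn rate = error rate / error budget.

    Assumption: the error budget is 1 - slo_target (for example, 0.01 for a 99% target).
    """
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_rate / error_budget


def should_alert(long_burn: float, short_burn: float, threshold: float) -> bool:
    """An alert is triggered only when BOTH periods surpass the threshold."""
    return long_burn > threshold and short_burn > threshold


# Hypothetical measurements for a 99% SLO target.
long_burn = burn_rate(error_rate=0.02, slo_target=0.99)    # roughly 2
short_burn = burn_rate(error_rate=0.005, slo_target=0.99)  # roughly 0.5

print(short_lookback_minutes(1))
print(should_alert(long_burn, short_burn, threshold=1.5))
```

Requiring both periods to exceed the threshold means a problem must be sustained (long period) and still ongoing (short period) before an alert is raised.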

When you use the UI to create an SLO, a default SLO burn rate alert rule is created automatically. The burn rate rule will use the default configuration and no connector. You must configure a connector if you want to receive alerts for SLO breaches.

To create an SLO burn rate rule, go to Observability → SLOs. Click the more options icon to the right of the SLO you want to add a burn rate rule for, and select Create new alert rule from the drop-down menu:

[Screenshot: Create new alert rule menu]

To create your SLO burn rate rule:

  1. Set your long lookback period under Lookback period (hours). Your short lookback period is set automatically.
  2. Set your Burn rate threshold. Under this field, you’ll see how long you have until your error budget is exhausted.
  3. Set how often the condition is evaluated in the Check every field.
  4. Optionally, change the number of consecutive runs that must meet the rule conditions before an alert occurs in the Advanced options.
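
The time-to-exhaustion hint shown under the Burn rate threshold field in step 2 can be approximated with the standard burn rate relationship: at a constant burn rate, the error budget lasts for the SLO time window divided by the burn rate. The sketch below assumes that relationship. This page does not state the exact formula the UI uses, and the function name and the 30-day window are hypothetical.

```python
def hours_until_budget_exhausted(burn_rate_threshold: float, slo_window_days: float) -> float:
    """Hours until the error budget is fully consumed at a constant burn rate.

    Assumption: a burn rate of 1 consumes the budget in exactly one SLO window,
    so a burn rate of N consumes it N times faster.
    """
    if burn_rate_threshold <= 0:
        raise ValueError("burn_rate_threshold must be positive")
    return slo_window_days * 24 / burn_rate_threshold


# Hypothetical example: a 30-day SLO window and a burn rate threshold of 10.
print(hours_until_budget_exhausted(10, 30))
```

A higher threshold therefore means the rule only fires for faster-burning incidents, which leave less time to react.
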
Action types

Extend your rules by connecting them to actions that use the following supported built-in integrations. Actions are Kibana services or integrations with third-party systems that run as background tasks on the Kibana server when rule conditions are met.

You can configure action types on the Settings page.

Some connector types are paid commercial features, while others are free. For a comparison of the Elastic subscription levels, go to the subscription page.

After you select a connector, you must set the action frequency. You can choose to create a Summary of alerts on each check interval or on a custom interval. For example, you can send email notifications that summarize the new, ongoing, and recovered alerts every twelve hours.

Alternatively, you can set the action frequency to For each alert and specify the conditions each alert must meet for the action to run. For example, you can send an email only when the alert status changes to critical.

[Screenshot: Configure when a rule is triggered]

Action variables

Use the default notification message or customize it. You can add more context to the message by clicking the icon above the message text box and selecting from a list of available variables.

[Screenshot: Action variables with default SLO message]

The following variables are specific to this rule type. You can also specify variables common to all rules.

context.alertDetailsUrl
Link to the alert troubleshooting view for further context and details. This is an empty string if server.publicBaseUrl is not configured.
context.burnRateThreshold
The burn rate threshold value.
context.longWindow
The long window duration with the associated burn rate value.
context.reason
A concise description of the reason for the alert.
context.shortWindow
The short window duration with the associated burn rate value.
context.sloId
The SLO unique identifier.
context.sloInstanceId
The SLO instance ID.
context.sloName
The SLO name.
context.timestamp
A timestamp of when the alert was detected.
context.viewInAppUrl
The URL to the SLO details page to help with further investigation.
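
To show how these variables fit into a customized message, the sketch below fills a Mustache-style template with sample values. Kibana renders action messages with Mustache; the small `render` helper here is only a simplified stand-in for that rendering, and every value in `sample_variables` is hypothetical.

```python
import re

# A custom message using some of the rule-specific variables listed above.
TEMPLATE = (
    "{{context.reason}}\n"
    "SLO: {{context.sloName}} ({{context.sloId}})\n"
    "Burn rate threshold: {{context.burnRateThreshold}}\n"
    "Detected at: {{context.timestamp}}\n"
    "Details: {{context.viewInAppUrl}}"
)


def render(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder; unknown names become empty strings."""
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda match: str(variables.get(match.group(1), "")),
        template,
    )


# Hypothetical values, standing in for what the rule would supply at run time.
sample_variables = {
    "context.reason": "Burn rate exceeded the threshold in both lookback periods",
    "context.sloName": "checkout-availability",
    "context.sloId": "example-slo-id",
    "context.burnRateThreshold": 14.4,
    "context.timestamp": "2024-01-01T00:00:00Z",
    "context.viewInAppUrl": "https://kibana.example.com/slo-details",
}

print(render(TEMPLATE, sample_variables))
```

In the rule form itself you only write the template text; the values are filled in when the action runs.
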
Alert recovery

To receive a notification when the alert recovers, select Run when Recovered. Use the default notification message or customize it. You can add more context to the message by clicking the icon above the message text box and selecting from a list of available variables.

[Screenshot: Default recovery message for Uptime duration anomaly rules with open "Add variable" popup listing available action variables]

Next steps

Learn how to view alerts and triage SLO burn rate breaches: