SLOs
editSLOs
editTo create and manage SLOs, you need an appropriate license and SLO access must be configured.
SLOs allow you to set clear, measurable targets for your service performance, based on factors like availability, response times, error rates, and other key metrics. You can define SLOs based on different types of data sources, such as custom KQL queries and APM latency or availability data.
Once you’ve defined your SLOs, you can monitor them in real time, with detailed dashboards and alerts that help you quickly identify and troubleshoot any issues that may arise. You can also track your progress against your SLO targets over time, with a clear view of your error budgets and burn rates.
Important concepts
editThe following table lists some important concepts related to SLOs:
Service-level indicator (SLI) |
The measurement of your service’s performance, such as service latency or availability. |
SLO |
The target you set for your SLI. It specifies the level of performance you expect from your service over a period of time. |
Error budget |
The amount of time that your SLI can not meet the SLO target before it violates your SLO. |
Burn rate |
The rate at which your service consumes your error budget. |
SLO overview
editFrom the SLO overview, you can see all of your SLOs and a quick summary of what’s happening in each one:
Select an SLO from the overview to see additional details including:
- Burn rate: the percentage of bad events over different time periods (1h, 6h, 24h, 72h) and the risk of exhausting your error budget within those time periods.
- Historical SLI: the SLI value and how it’s trending over the SLO time window.
- Error budget burn down: the remaining error budget and how it’s trending over the SLO time window.
- Alerts: active alerts if you’ve set any SLO burn rate alert rules for the SLO.
SLO dashboard panels
editSLO data is also available as Dashboard panels. Panels allow you to curate custom data views and visualizations to bring clarity to your data.
Available SLO panels include:
- SLO Overview: Visualize a selected SLO’s health, including name, current SLI value, target, and status.
- SLO Alerts: Visualize one or more SLO alerts, including status, rule name, duration, and reason. In addition, configure and update alerts, or create cases directly from the panel.
See Dashboard and visualizations to learn how to add panels to a Dashboard.
Upgrade from beta to GA
editStarting in version 8.12.0, SLOs are generally available (GA). If you’re upgrading from a beta version of SLOs (available in 8.11.0 and earlier), you must migrate your SLO definitions to a new format.
Migrate your SLO definitions
To migrate your SLO definitions, open the SLO overview. A banner will display the number of outdated SLOs detected. For each outdated SLO, click Reset. If you no longer need the SLO, select Delete.
If you have a large number of SLO definitions, it is possible to automate this process. To do this, you’ll need to use two Elastic APIs:
-
SLO Definitions Find API (
/api/observability/slos/_definitions
) -
SLO Reset API (
/api/observability/slos/${id}/_reset
)
Pass in includeOutdatedOnly=1
as a query parameter to the Definitions Find API.
This will display your outdated SLO definitions.
Loop through this list, one by one, calling the Reset API on each outdated SLO definition.
The Reset API loads the outdated SLO definition and resets it to the new format required for GA.
Once an SLO is reset, it will start to regenerate SLIs and summary data.
For more information on automating the update process, see framsouza/es-slo-upgrade-helper.
Use the framsouza/es-slo-upgrade-helper at your own risk. This tool is not officially maintained or supported by Elastic.
Remove legacy summary transforms
After migrating to 8.12 or later, you might have some legacy SLO summary transforms running. You can safely delete the following legacy summary transforms:
# Stop all legacy summary transforms POST _transform/slo-summary-occurrences-30d-rolling/_stop?force=true POST _transform/slo-summary-occurrences-7d-rolling/_stop?force=true POST _transform/slo-summary-occurrences-90d-rolling/_stop?force=true POST _transform/slo-summary-occurrences-monthly-aligned/_stop?force=true POST _transform/slo-summary-occurrences-weekly-aligned/_stop?force=true POST _transform/slo-summary-timeslices-30d-rolling/_stop?force=true POST _transform/slo-summary-timeslices-7d-rolling/_stop?force=true POST _transform/slo-summary-timeslices-90d-rolling/_stop?force=true POST _transform/slo-summary-timeslices-monthly-aligned/_stop?force=true POST _transform/slo-summary-timeslices-weekly-aligned/_stop?force=true # Delete all legacy summary transforms DELETE _transform/slo-summary-occurrences-30d-rolling?force=true DELETE _transform/slo-summary-occurrences-7d-rolling?force=true DELETE _transform/slo-summary-occurrences-90d-rolling?force=true DELETE _transform/slo-summary-occurrences-monthly-aligned?force=true DELETE _transform/slo-summary-occurrences-weekly-aligned?force=true DELETE _transform/slo-summary-timeslices-30d-rolling?force=true DELETE _transform/slo-summary-timeslices-7d-rolling?force=true DELETE _transform/slo-summary-timeslices-90d-rolling?force=true DELETE _transform/slo-summary-timeslices-monthly-aligned?force=true DELETE _transform/slo-summary-timeslices-weekly-aligned?force=true
Do not delete any new summary transforms used by your migrated SLOs.
Next steps
editTo get started using SLOs to measure your service performance, see the following pages: