Health API
editHealth API
editAn experimental API that returns the health status of an Elasticsearch cluster.
This API is currently experimental for internal use by Elastic software only.
This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
Request
editGET /_internal/_health
GET /_internal/_health/<component>
GET /_internal/_health/<component>/<indicator>
Prerequisites
edit-
If the Elasticsearch security features are enabled, you must have the
monitor
ormanage
cluster privilege to use this API.
Description
editThe health API returns the health status of an Elasticsearch cluster. It returns a list of components that compose Elasticsearch functionality. Each component’s health is determined by health indicators associated with the component.
Each indicator has a health status of: green
, unknown
, yellow
or red
. The indicator will
provide an explanation and metadata describing the reason for its current health status.
A component’s status is controlled by the worst indicator status. The cluster’s status is controlled by the worst component status.
In the event that an indicator’s status is non-green, a list of impacts may be present in the indicator result which detail the functionalities that are negatively affected by the health issue. Each impact carries with it a severity level, an area of the system that is affected, and a simple description of the impact on the system.
Some health indicators can determine the root cause of a health problem and prescribe a set of user actions that can be performed in order to improve the health of the system. User actions contain a message detailing a root cause analysis, the list of affected resources (if applicable) and steps to take that will improve the health of your cluster. User actions can also optionally provide URLs to troubleshooting documentation.
Path parameters
edit-
<component>
-
(Optional, string) Limits the information returned to the specific component. A comma-separated list of the following options:
-
cluster_coordination
- All health reporting that pertains to a single, stable elected master which coordinates other nodes to ensure they all have a consistent view of the cluster state. If your cluster doesn’t have a stable master, many of its features won’t work correctly. You must fix the master node’s instability before addressing these other issues. It will not be possible to solve any other issues while the master node is unstable.
-
data
- All health reporting that pertains to data availability, lifecycle, and responsiveness.
-
snapshot
- All health reporting that pertains to data archival, backups, and searchable snapshots.
-
-
<indicator>
-
(Optional, string) Limit the information returned for a
component
to a specific indicator. An indicator can be used only if its correspondingcomponent
is specified. Supported indicators and their corresponding components are:-
instance_has_master
-
(Component
cluster_coordination
) Reports health issues regarding connections to the master node. -
shards_availability
-
(Component
data
) Reports health issues regarding shard assignments. -
ilm
-
(Component
data
) Reports health issues related to Indexing Lifecycle Management. -
repository_integrity
-
(Component
snapshot
) Tracks repository integrity and reports health issues that arise if repositories become corrupted. -
slm
-
(Component
snapshot
) Reports health issues related to Snapshot Lifecycle Management.
-
Query parameters
edit-
explain
-
(Optional, Boolean) If
true
, the response includes additional details that help explain the status of each non-green indicator. These details include additional troubleshooting metrics and sometimes a root cause analysis of a health status. Defaults totrue
.
Some health indicators perform root cause analysis of non-green health statuses. This can
be computationally expensive when called frequently. When setting up automated polling of the API
for health status set explain
to false
to disable the more expensive analysis logic.
Response body
edit-
cluster_name
- (string) The name of the cluster.
-
status
-
(Optional, string) Health status of the cluster, based on the aggregated status of all components in the cluster. If the health of a specific component or indicator is being requested, this top level status will be omitted. Statuses are:
-
green
- The cluster is healthy.
-
unknown
- The health of the cluster could not be determined.
-
yellow
-
The functionality of a cluster is in a degraded state and may need remediation
to avoid the health becoming
red
. -
red
- The cluster is experiencing an outage or certain features are unavailable for use.
-
-
components
-
(object) Information about the health of the cluster components.
Properties of
components
-
<component>
-
(object) Contains health results for a component.
Properties of
<component>
-
status
-
(Optional, string) Health status of the component, based on the aggregated status of all indicators in the component. If only the health of a specific indicator is being requested, this component level status will be omitted. The component status is not displayed in this case in order to avoid reporting a false component status given that not all indicators are evaluated. Statuses are:
-
green
- The component is healthy.
-
unknown
- The health of the component could not be determined.
-
yellow
-
The functionality of a component is in a degraded state and may need remediation
to avoid the health becoming
red
. -
red
- The component is experiencing an outage or certain features are unavailable for use.
-
-
indicators
-
(object) Information about the health of the indicators under a component
Properties of
indicators
-
<indicator>
-
(object) Contains health results for an indicator.
Properties of
<indicator>
-
status
-
(string) Health status of the indicator. Statuses are:
-
green
- The indicator is healthy.
-
unknown
- The health of the indicator could not be determined.
-
yellow
-
The functionality of an indicator is in a degraded state and may need remediation
to avoid the health becoming
red
. -
red
- The indicator is experiencing an outage or certain features are unavailable for use.
-
-
summary
- (string) A message providing information about the current health status.
-
help_url
- (Optional, string) A link to additional troubleshooting guides for this indicator.
-
details
-
(Optional, object) An object that contains additional information about the cluster that
has lead to the current health status result. This data is unstructured, and each
indicator may return a unique set of details. Details will not be calculated if the
explain
property is set to false. -
impacts
-
(Optional, array) If a non-healthy status is returned, indicators may include a list of impacts that this health status will have on the cluster.
Properties of
impacts
-
severity
- (integer) How important this impact is to the functionality of the cluster. A value of 1 is the highest severity, with larger values indicating lower severity.
-
description
- (string) A description of the impact on the cluster.
-
impact_areas
-
(array of strings) The areas of cluster functionality that this impact affects. Possible values are:
-
search
-
ingest
-
backup
-
deployment_management
-
-
-
user_actions
-
(Optional, array) If a non-healthy status is returned, indicators may include a list of user actions to take in order to remediate the health issue. User actions and root cause analysis will not be calculated if the
explain
property is false.Properties of
user_actions
-
message
- (string) A description of a root cause of this health status and the steps that should be taken to remediate the problem.
-
affected_resources
- (Optional, array of strings) If the root cause pertains to multiple resources in the cluster (like indices, shards, nodes, etc…) this will hold all resources that this user action is applicable for.
-
help_url
- (string) A link to additional troubleshooting information for this user action.
-
-
-
-
-
Examples
editGET _internal/_health
The API returns a response with all the components and indicators regardless of current status.
GET _internal/_health/data
The API returns a response with just the data component.
GET _internal/_health/data/shards_availability
The API returns a response for just the shard availability indicator within the data component.
GET _internal/_health?explain=false
The API returns a response with all components and indicators health but will not calculate details or root cause analysis for the response. This is helpful if you would like to monitor the health API and do not want the overhead of calculating additional troubleshooting details each call.