High CPU usage
Elasticsearch uses thread pools to manage CPU resources for concurrent operations. High CPU usage typically means one or more thread pools are running low.
If a thread pool is depleted, Elasticsearch will reject requests related to the thread pool. For example, if the search thread pool is depleted, Elasticsearch will reject search requests until more threads are available.
You might experience high CPU usage if a data tier, and therefore the nodes assigned to that tier, is experiencing more traffic than other tiers. This imbalance in resource utilization is also known as hot spotting.
Diagnose high CPU usage
Check CPU usage
You can check the CPU usage per node using the cat nodes API:
Python:

    resp = client.cat.nodes(
        v=True,
        s="cpu:desc",
    )
    print(resp)

Ruby:

    response = client.cat.nodes(
      v: true,
      s: 'cpu:desc'
    )
    puts response

JavaScript:

    const response = await client.cat.nodes({
      v: "true",
      s: "cpu:desc",
    });
    console.log(response);

Console:

    GET _cat/nodes?v=true&s=cpu:desc
The response’s cpu column contains the current CPU usage as a percentage. The name column contains the node’s name. Elevated but transient CPU usage is normal. However, if CPU usage is elevated for an extended duration, it should be investigated.
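As a minimal sketch of how the cpu and name columns might be checked programmatically, the function below flags nodes at or above a CPU threshold. It assumes the rows were fetched with the cat nodes API in JSON form (for example `client.cat.nodes(format="json", h="name,cpu", s="cpu:desc")`), where values arrive as strings. The function name `nodes_above_cpu`, the threshold of 85, and the sample rows are illustrative assumptions, not values recommended by Elasticsearch.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def nodes_above_cpu(rows: Iterable[Mapping[str, str]], threshold: int = 85) -> list[str]:
    """Return the names of nodes whose reported CPU percentage is >= threshold."""
    busy = []
    for row in rows:
        cpu = row.get("cpu")
        # The cat APIs report values as strings; skip rows without a numeric value.
        if cpu is None or not cpu.isdigit():
            continue
        if int(cpu) >= threshold:
            busy.append(row["name"])
    return busy


# Illustrative rows (hypothetical node names), shaped like the JSON form of cat nodes.
sample = [
    {"name": "node-a", "cpu": "93"},
    {"name": "node-b", "cpu": "12"},
    {"name": "node-c", "cpu": "88"},
]
print(nodes_above_cpu(sample))
```

A single reading only shows a snapshot, so a check like this is most useful when repeated over time, in line with the note that transient spikes are normal.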
To track CPU usage over time, we recommend enabling monitoring:
- (Recommended) Enable logs and metrics. When logs and metrics are enabled, monitoring information is visible on Kibana’s Stack Monitoring page.
  You can also enable the CPU usage threshold alert to be notified about potential issues through email.
- From your deployment menu, view the Performance page. On this page, you can view two key metrics:
  - CPU usage: Your deployment’s CPU usage, represented as a percentage.
  - CPU credits: Your remaining CPU credits, measured in seconds of CPU time.
  Elasticsearch Service grants CPU credits per deployment to provide smaller clusters with performance boosts when needed. High CPU usage can deplete these credits, which might lead to performance degradation and increased cluster response times.
- Enable Elasticsearch monitoring. When logs and metrics are enabled, monitoring information is visible on Kibana’s Stack Monitoring page.
  You can also enable the CPU usage threshold alert to be notified about potential issues through email.
Check hot threads
If a node has high CPU usage, use the nodes hot threads API to check for resource-intensive threads running on the node.
JavaScript:

    const response = await client.nodes.hotThreads();
    console.log(response);

Console:

    GET _nodes/hot_threads
This API returns a breakdown of any hot threads in plain text. High CPU usage frequently correlates with a long-running task or a backlog of tasks.
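Because the hot threads report is plain text, a small parser can help rank the busiest threads. The sketch below assumes each thread entry starts with a line of the form `<percent>% ... cpu usage by thread '<thread name>'`; the exact layout may differ between Elasticsearch versions, so treat the pattern as an assumption to verify against your own output. The function name `summarize_hot_threads` and the sample report are illustrative, not real output. With the Python client, the text could be obtained from `client.nodes.hot_threads()`.

```python
from __future__ import annotations

import re

# Assumed line shape: "<pct>% ... cpu usage by thread '<thread name>'"
HOT_THREAD_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)%.*cpu usage by thread '([^']+)'")


def summarize_hot_threads(report: str) -> list[tuple[float, str]]:
    """Return (cpu_percent, thread_name) pairs, busiest thread first."""
    found = []
    for line in report.splitlines():
        match = HOT_THREAD_LINE.match(line)
        if match:
            found.append((float(match.group(1)), match.group(2)))
    return sorted(found, reverse=True)


# Illustrative text only; a real report also includes stack traces per thread.
sample_report = """\
::: {my-node}{oTUltX4IQMOUUVeiohTt8A}
   Hot threads at 2024-01-01T00:00:00.000Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   12.5% (62.5ms out of 500ms) cpu usage by thread 'elasticsearch[my-node][write][T#2]'
   81.0% (405ms out of 500ms) cpu usage by thread 'elasticsearch[my-node][search][T#1]'
"""

for percent, thread in summarize_hot_threads(sample_report):
    print(f"{percent:5.1f}%  {thread}")
```

The thread name usually includes the thread pool (here `search` and `write`), which hints at whether searching or indexing is driving the load.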
Reduce CPU usage
The following tips outline the most common causes of high CPU usage and their solutions.
Scale your cluster
Heavy indexing and search loads can deplete smaller thread pools. To better handle heavy workloads, add more nodes to your cluster or upgrade your existing nodes to increase capacity.
Spread out bulk requests
While more efficient than individual requests, large bulk indexing or multi-search requests still require CPU resources. If possible, submit smaller requests and allow more time between them.
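One way to submit smaller requests with more time between them is to split the workload into batches on the client side and pause between sends. The sketch below is client-agnostic: `send` is a hypothetical callable standing in for whatever issues the bulk or multi-search request, and the default `batch_size` and `pause_seconds` values are placeholders to tune for your cluster, not recommendations.

```python
import time
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence, size: int) -> Iterator[list]:
    """Yield consecutive slices of items, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def send_in_batches(
    items: Sequence,
    send: Callable[[list], None],
    batch_size: int = 500,
    pause_seconds: float = 1.0,
) -> int:
    """Send items in small batches, sleeping between batches. Returns the batch count."""
    batches = 0
    for batch in chunked(items, batch_size):
        if batches:
            # Pause before every batch except the first to spread out the load.
            time.sleep(pause_seconds)
        send(batch)
        batches += 1
    return batches


# Usage with a stand-in sender that only records what it was given.
sent = []
count = send_in_batches(list(range(5)), sent.append, batch_size=2, pause_seconds=0)
print(count, sent)
```

In real use, `send` would wrap the bulk call of your client, and rejected requests would be a signal to reduce the batch size or lengthen the pause.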
Cancel long-running searches
Long-running searches can block threads in the search thread pool. To check for these searches, use the task management API.
Python:

    resp = client.tasks.list(
        actions="*search",
        detailed=True,
    )
    print(resp)

Ruby:

    response = client.tasks.list(
      actions: '*search',
      detailed: true
    )
    puts response

JavaScript:

    const response = await client.tasks.list({
      actions: "*search",
      detailed: "true",
    });
    console.log(response);

Console:

    GET _tasks?actions=*search&detailed
The response’s description contains the search request and its queries. running_time_in_nanos shows how long the search has been running.
    {
      "nodes" : {
        "oTUltX4IQMOUUVeiohTt8A" : {
          "name" : "my-node",
          "transport_address" : "127.0.0.1:9300",
          "host" : "127.0.0.1",
          "ip" : "127.0.0.1:9300",
          "tasks" : {
            "oTUltX4IQMOUUVeiohTt8A:464" : {
              "node" : "oTUltX4IQMOUUVeiohTt8A",
              "id" : 464,
              "type" : "transport",
              "action" : "indices:data/read/search",
              "description" : "indices[my-index], search_type[QUERY_THEN_FETCH], source[{\"query\":...}]",
              "start_time_in_millis" : 4081771730000,
              "running_time_in_nanos" : 13991383,
              "cancellable" : true
            }
          }
        }
      }
    }
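To pick out the searches worth cancelling, the response can be filtered by running_time_in_nanos. The following sketch walks a response shaped like the example above and returns the IDs of cancellable tasks that have run for at least a given number of seconds. The function name `long_running_tasks` and the thresholds used are illustrative assumptions; the sample dictionary is a trimmed copy of the example response.

```python
from __future__ import annotations


def long_running_tasks(response: dict, min_seconds: float) -> list[tuple[str, float]]:
    """Return (task_id, running_seconds) for cancellable tasks at or over min_seconds."""
    min_nanos = min_seconds * 1_000_000_000
    slow = []
    for node in response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            running = task.get("running_time_in_nanos", 0)
            if task.get("cancellable") and running >= min_nanos:
                slow.append((task_id, running / 1_000_000_000))
    # Longest-running task first.
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Trimmed copy of the example task list response.
sample_response = {
    "nodes": {
        "oTUltX4IQMOUUVeiohTt8A": {
            "name": "my-node",
            "tasks": {
                "oTUltX4IQMOUUVeiohTt8A:464": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 13991383,
                    "cancellable": True,
                }
            },
        }
    }
}

for task_id, seconds in long_running_tasks(sample_response, min_seconds=0.01):
    print(task_id, f"{seconds:.3f}s")
```

Each returned task ID has the `node_id:task_number` form that the task management API expects when cancelling a task.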
To cancel a search and free up resources, use the API’s _cancel endpoint.
Python:

    resp = client.tasks.cancel(
        task_id="oTUltX4IQMOUUVeiohTt8A:464",
    )
    print(resp)

Ruby:

    response = client.tasks.cancel(
      task_id: 'oTUltX4IQMOUUVeiohTt8A:464'
    )
    puts response

JavaScript:

    const response = await client.tasks.cancel({
      task_id: "oTUltX4IQMOUUVeiohTt8A:464",
    });
    console.log(response);

Console:

    POST _tasks/oTUltX4IQMOUUVeiohTt8A:464/_cancel
For additional tips on how to track and avoid resource-intensive searches, see Avoid expensive searches.