Host metrics


If you haven’t already, you need to install and configure Metricbeat to populate the Infrastructure app with data. For more information, see Ingest metrics.

Host metrics are ingested using the Metricbeat system module, which is enabled by default, and become available for analysis in the Infrastructure app.

To help you analyze the host metrics listed on the Inventory page, you can select view filters based on the following predefined metrics, or you can add custom metrics.

CPU Usage

Sum of the average of system.cpu.user.pct and the average of system.cpu.system.pct, divided by system.cpu.cores.
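As a rough illustration of the CPU Usage formula, the following Python sketch adds the two averages and normalizes the result by the core count. The function name and the sample values are hypothetical; the Infrastructure app performs this calculation with Elasticsearch aggregations, not in Python.

```python
from statistics import mean

def cpu_usage(user_pct, system_pct, cores):
    """Return (avg(user) + avg(system)) / cores as a fraction of total capacity."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return (mean(user_pct) + mean(system_pct)) / cores

# Hypothetical samples from a 4-core host. The pct fields are summed across
# cores, so they can exceed 1.0 before being divided by the core count.
usage = cpu_usage([1.0, 1.5], [0.25, 0.75], cores=4)
print(f"CPU usage: {usage:.2%}")
```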

Memory Usage

Average of system.memory.actual.used.pct.

Load

Average of system.load.5.

Inbound Traffic

Derivative of the maximum of host.network.ingress.bytes, scaled to a 1-second rate.

Outbound Traffic

Derivative of the maximum of host.network.egress.bytes, scaled to a 1-second rate.
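The traffic metrics above take the per-bucket maximum of a byte field, compute the change between consecutive buckets, and scale that change to a per-second rate. A minimal Python sketch of that calculation follows; the function name and the bucket values are hypothetical, and it only approximates what a derivative aggregation does.

```python
def per_second_rate(buckets):
    """Compute a per-second rate from time-ordered (epoch_seconds, max_value) buckets.

    Returns one (epoch_seconds, rate) pair for every bucket after the first.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(buckets, buckets[1:]):
        # Change in value divided by elapsed seconds gives the 1-second rate.
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates

# Hypothetical 60-second buckets holding the maximum byte value seen in each.
ingress = [(0, 1000), (60, 7000), (120, 10000)]
for timestamp, rate in per_second_rate(ingress):
    print(timestamp, rate)
```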

Log Rate

Derivative of the cumulative sum of the document count, scaled to a 1-second rate. This metric relies on the same indices as the logs.
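To make the Log Rate definition concrete, the sketch below builds a cumulative sum of per-bucket document counts, takes the difference between consecutive cumulative values, and divides by the bucket length in seconds. The function name, the bucket size, and the counts are assumptions chosen for illustration.

```python
from itertools import accumulate

def log_rate(doc_counts, bucket_seconds):
    """Return log entries per second for every bucket after the first."""
    cumulative = list(accumulate(doc_counts))
    # The derivative of the cumulative sum is the increase between buckets.
    return [(b - a) / bucket_seconds for a, b in zip(cumulative, cumulative[1:])]

# Hypothetical document counts for three consecutive 60-second buckets.
print(log_rate([120, 60, 180], bucket_seconds=60))
```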

For information about which required fields the Infrastructure app uses to display host metrics, see the Infrastructure app field reference.

Host details


Without leaving the Inventory page, you can view enhanced details relating to each host running in your infrastructure. On the waffle map, select the host to display the host details overlay.

The host details overlay contains the following tabs:

Metrics

The Metrics tab displays CPU, load, memory, and network metrics relating to the host, along with the log rate and any custom metrics that you have defined. You can change the time range to view metrics over the last 15 minutes, 1 hour, 3 hours, 24 hours, or 7 days. You can also hover over a specific point in time on a chart to compare the various metrics at that time.

CPU

Averages of system.cpu.user.pct divided by system.cpu.cores and system.cpu.system.pct divided by system.cpu.cores.

Load

Averages of system.load.1, system.load.5, and system.load.15.

Memory

For Linux systems, memory used is the average of system.memory.actual.used.bytes and memory free is the average of system.memory.actual.free.

For non-Linux systems, memory used is the average of system.memory.used.bytes and memory free is the average of system.memory.free.

Network

Rates of host.network.ingress.bytes and host.network.egress.bytes.

Log Rate

Derivative of the cumulative sum of the document count, scaled to a 1-second rate. This metric relies on the same indices as the logs.

Custom metric

A chart is displayed for each custom metric that you have added and defined on the Inventory page.

Logs

The Logs tab displays logs relating to the host that you have selected. By default, the Logs tab displays the following columns:

Timestamp

The timestamp of the log entry from the timestamp field.

Message

The message extracted from the document. The content of this field depends on the type of log message. If no special log message type is detected, the Elastic Common Schema (ECS) base field, message, is used.

You can customize the logs view by adding a column for any field that you want to filter by. For more information, see Customize Stream. To view the logs in the Logs app for a detailed analysis, click Open in Logs.

Processes

The Processes tab lists the total number of processes (system.process.summary.total) running on the host, along with the number of processes in each of the following states:

  • Running (system.process.summary.running)
  • Sleeping (system.process.summary.sleeping)
  • Stopped (system.process.summary.stopped)
  • Idle (system.process.summary.idle)
  • Dead (system.process.summary.dead)
  • Zombie (system.process.summary.zombie)
  • Unknown (system.process.summary.unknown)

The processes listed in the Top processes table are based on an aggregation of the top CPU and the top memory consuming processes. The number of top processes is controlled by process.include_top_n.by_cpu and process.include_top_n.by_memory.
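The selection described above can be pictured as the union of two ranked lists. The following Python sketch assumes a simple merge that deduplicates by PID; the function name, the dictionary keys, and the sample values are hypothetical, and the exact merging and ordering used by Metricbeat and the Infrastructure app may differ.

```python
def top_processes(processes, by_cpu, by_memory):
    """Combine the top CPU consumers and top memory consumers, deduplicated by PID."""
    top_cpu = sorted(processes, key=lambda p: p["cpu_pct"], reverse=True)[:by_cpu]
    top_mem = sorted(processes, key=lambda p: p["mem_pct"], reverse=True)[:by_memory]
    # Keying by PID removes processes that appear in both rankings.
    selected = {p["pid"]: p for p in top_cpu + top_mem}
    # Assumption: present the combined list ordered by CPU usage.
    return sorted(selected.values(), key=lambda p: p["cpu_pct"], reverse=True)

# Hypothetical process samples.
sample = [
    {"pid": 1, "cpu_pct": 0.9, "mem_pct": 0.1},
    {"pid": 2, "cpu_pct": 0.1, "mem_pct": 0.8},
    {"pid": 3, "cpu_pct": 0.5, "mem_pct": 0.2},
    {"pid": 4, "cpu_pct": 0.2, "mem_pct": 0.3},
]
for proc in top_processes(sample, by_cpu=2, by_memory=2):
    print(proc["pid"], proc["cpu_pct"], proc["mem_pct"])
```

Here by_cpu and by_memory play the role of process.include_top_n.by_cpu and process.include_top_n.by_memory.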

Command

Full command line that started the process, including the absolute path to the executable, and all the arguments (system.process.cmdline).

PID

Process ID (process.pid).

User

User name (user.name).

CPU

The percentage of CPU time spent by the process since the last event (system.process.cpu.total.pct).

Time

The time the process started (system.process.cpu.start_time).

Memory

The percentage of main memory (RAM) that the process occupies (system.process.memory.rss.pct).

State

The current state of the process and the total number of processes (system.process.state). Expected values are: running, sleeping, dead, stopped, idle, zombie, and unknown.

Metadata

The Metadata tab lists all the metadata relating to the host:

  • Host information
  • Cloud information
  • Agent information

All of this information can help when investigating events, for example, by filtering by operating system or architecture.

Anomalies

The Anomalies table lists each single metric anomaly detection job for the specific host. By default, anomaly jobs are sorted by time, with the most recent job first.

Along with the name of each anomaly job, detected anomalies with a severity score of 50 or higher are listed. These scores represent a severity of "warning" or higher in the selected time period. The summary value represents the increase between the actual value and the expected ("typical") value of the host metric in the anomaly record result.
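The listing rule can be sketched as a filter on the severity score plus a comparison of the actual and typical values. In the Python sketch below, the dictionary keys, the function name, and the choice to express the summary as a ratio of actual to typical are all assumptions made for illustration, not the app's confirmed implementation.

```python
def listed_anomalies(records, min_score=50):
    """Keep anomalies at or above min_score and attach a summary ratio."""
    listed = []
    for record in records:
        if record["score"] < min_score:
            continue
        typical = record["typical"]
        # Assumption: summarize the increase as actual / typical; skip the
        # ratio when the typical value is zero to avoid division by zero.
        ratio = record["actual"] / typical if typical else None
        listed.append({**record, "summary": ratio})
    return listed

# Hypothetical anomaly records for one host.
records = [
    {"job": "hosts_memory_usage", "score": 75, "actual": 8.0, "typical": 2.0},
    {"job": "hosts_network_in", "score": 30, "actual": 3.0, "typical": 2.0},
]
for anomaly in listed_anomalies(records):
    print(anomaly["job"], anomaly["score"], anomaly["summary"])
```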

To drill down and analyze the metric anomaly, select Actions > Open in Anomaly Explorer to view the Anomaly Explorer in Machine Learning. You can also select Actions > Show in Inventory to view the host Inventory page, filtered by the specific metric.

Osquery

To use Osquery, you must have an active Elastic Agent whose assigned agent policy includes the Osquery Manager integration, and your user must have Osquery Kibana privileges.


The Osquery tab allows you to build SQL statements to query your host data. You can create and run live or saved queries against the Elastic Agent. Osquery results are stored in Elasticsearch so that you can use the Elastic Stack to search, analyze, and visualize your host metrics. To create saved queries and add scheduled query groups, see Osquery.

For example, you can query for the top five memory-consuming processes running on the host. Under the Results tab, the total virtual memory size (total_size, renamed to memory_used to be more readable) is returned in descending order, along with the process ID (pid) and the process path (name).
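Osquery exposes host data as SQL tables and uses SQLite as its query engine, so a query of the kind described above can be approximated locally with Python's sqlite3 module. The query text below is an assumption reconstructed from that description, and the table rows are made-up sample data; in Osquery, the processes table is populated from the live host.

```python
import sqlite3

# Assumed query: top 5 processes by total virtual memory size.
QUERY = """
SELECT pid, name, total_size AS memory_used
FROM processes
ORDER BY total_size DESC
LIMIT 5;
"""

def top_memory_processes(rows):
    """Run QUERY against an in-memory stand-in for the processes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processes (pid INTEGER, name TEXT, total_size INTEGER)")
    conn.executemany("INSERT INTO processes VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

# Hypothetical (pid, name, total_size) rows.
sample_rows = [
    (101, "java", 9_000_000),
    (102, "node", 4_000_000),
    (103, "python3", 6_000_000),
    (104, "sshd", 500_000),
    (105, "postgres", 7_500_000),
    (106, "bash", 200_000),
]
for pid, name, memory_used in top_memory_processes(sample_rows):
    print(pid, name, memory_used)
```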

To view more information about the query, click the Status tab. The query status can be success, error (along with an error message), or pending (if the Elastic Agent is offline).

Other options include:

  • View in Discover to search, filter, and view information about the structure of host metric fields. To learn more, see Discover.
  • View in Lens to create visualizations based on your host metric fields. To learn more, see Lens.
  • View the results in full screen mode.
  • Add, remove, reorder, and resize columns.
  • Sort field names in ascending or descending order.