Fleet and Elastic Agent 8.8.0

Review important information about the Fleet and Elastic Agent 8.8.0 release.

Known issues

Elastic Agent upgrade process can sometimes stall.

Details
Elastic Agent upgrades can sometimes stall without returning an error message, and without the agent upgrade process restarting automatically.

Impact
In this situation the agent returns from Updating to a Healthy state, but without the new version having been installed. To address this, you can trigger a new upgrade manually.

This issue is specific to version 8.8.0 and is resolved in version 8.8.1.

Elastic Agent can fail when file paths generated to represent Unix sockets exceed 103 characters.

Details
When an internally generated file path exceeds this length it is truncated using a hash, and the newly constructed path might not be accessible to the agent.
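The behavior described above can be sketched roughly as follows. This is an illustration only, not the agent's actual implementation: the function names, the use of SHA-256, and the `/tmp/elastic-agent` fallback directory are assumptions, chosen because they are consistent with the shape of the socket path in the log error shown later in this section.

```python
import hashlib
from pathlib import PurePosixPath

# Limit quoted in this known issue.
MAX_SOCKET_PATH = 103


def needs_shortening(socket_path: str, limit: int = MAX_SOCKET_PATH) -> bool:
    """Return True when the path is longer than the limit, measured in bytes."""
    return len(socket_path.encode("utf-8")) > limit


def shortened_socket_path(socket_path: str,
                          fallback_dir: str = "/tmp/elastic-agent") -> str:
    """Replace an over-long socket path with a hash-based one.

    ASSUMPTION: the hash function (SHA-256) and fallback_dir are
    hypothetical; the release notes only say the path is "truncated
    using a hash".
    """
    if not needs_shortening(socket_path):
        return socket_path
    digest = hashlib.sha256(socket_path.encode("utf-8")).hexdigest()
    return str(PurePosixPath(fallback_dir) / f"{digest}.sock")
```

Under these assumptions, the shortened path lives in a different directory from the original one, so it is only usable if that directory exists and is accessible to the process binding the socket.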

To identify the problem, check the output of elastic-agent status --output=yaml or the state.yaml file in a diagnostics bundle for output like the following:

- id: kubernetes/metrics-60f88f50-c873-11ed-9baf-09fb5640c56a
  state:
    state: 4
    message: 'Failed: pid ''3770789'' exited with code ''1'''
    units:
      ? unittype: 1
        unitid: kubernetes/metrics-60f88f50-c873-11ed-9baf-09fb5640c56a
      : state: 4
        message: 'Failed: pid ''3770789'' exited with code ''1'''
      ? unittype: 0
        unitid: kubernetes/metrics-60f88f50-c873-11ed-9baf-09fb5640c56a-kubernetes/metrics-kubelet-0d1f291d-9b2e-4f44-a0dc-82ebee865799
      : state: 4
        message: 'Failed: pid ''3770789'' exited with code ''1'''
      ? unittype: 0
        unitid: kubernetes/metrics-60f88f50-c873-11ed-9baf-09fb5640c56a-kubernetes/metrics-kube-proxy-0d1f291d-9b2e-4f44-a0dc-82ebee865799
      : state: 4
        message: 'Failed: pid ''3770789'' exited with code ''1'''
    features_idx: 0
    version_info:
      name: ""
      version: ""

This is accompanied by an error message in the logs:

logs/elastic-agent-20230530-23.ndjson:{"log.level":"error","@timestamp":"2023-05-30T11:42:46.776Z","message":"Exiting: could not start the HTTP server for the API: listen unix /tmp/elastic-agent/6dd26cab2bb93d6254d75a9ef22c5fb5d3c5ffbd8866f26288d86d2f672d2ae6.sock: bind: no such file or directory","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"kubernetes/metrics-60f88f50-c873-11ed-9baf-08ec5473d24b","type":"kubernetes/metrics"},"log":{"source":"kubernetes/metrics-60e22e52-d872-12dc-4adf-09fb5242c26b"},"log.origin":{"file.line":1142,"file.name":"instance/beat.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
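To look for both symptoms in a diagnostics bundle, a small script along these lines may help. It is a minimal sketch that matches only the message formats shown in the samples above; the function names are hypothetical, and it assumes log lines may carry a grep-style `file:` prefix before the JSON record.

```python
import json
import re

# Matches the failure message as it appears in the YAML status output,
# where single quotes inside the string are doubled.
FAILED_PROCESS = re.compile(r"Failed: pid ''(\d+)'' exited with code ''(\d+)''")

# Matches the socket bind error shown in the sample log line.
SOCKET_ERROR = re.compile(
    r"listen unix (?P<path>\S+\.sock): bind: no such file or directory"
)


def failed_processes(status_text):
    """Return sorted, de-duplicated (pid, exit_code) pairs from status text."""
    pairs = {(int(pid), int(code))
             for pid, code in FAILED_PROCESS.findall(status_text)}
    return sorted(pairs)


def parse_log_line(line):
    """Parse one NDJSON record, skipping any prefix before the first '{'."""
    start = line.find("{")
    if start < 0:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def socket_bind_failures(lines):
    """Return (component id, socket path) for each matching error record."""
    failures = []
    for line in lines:
        record = parse_log_line(line)
        if not record or record.get("log.level") != "error":
            continue
        match = SOCKET_ERROR.search(record.get("message", ""))
        if match:
            component = record.get("component", {})
            failures.append((component.get("id"), match.group("path")))
    return failures
```

If `failed_processes` reports a failed process and `socket_bind_failures` reports a hashed `.sock` path for a component of the same type, the agent is likely affected by this issue.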

Impact

This issue is being investigated. Until it’s resolved, as a workaround you can reduce the length of the agent output name until the problem stops occurring.

Elastic Defend and Elastic Agent CPU spike when connectivity to Elasticsearch and/or Logstash is lost.

Details

When the output server (Elasticsearch or Logstash) is unreachable, versions 8.8.0 and 8.8.1 of Elastic Defend (or Elastic Endpoint) and Elastic Agent may enter a state where they repeatedly communicate with each other indefinitely. As a result, both processes show sustained, significantly higher CPU usage.

Versions 8.8.0 and 8.8.1 are affected on all operating systems. Elastic Agent exhibits this behavior only when the Elastic Defend integration is enabled.

Impact

This issue was resolved in version 8.8.2. If you are using Elastic Agent with the Elastic Defend integration, please update to 8.8.2 or later.

New features

The 8.8.0 release added the following new and notable features.

Fleet
  • Added audit logging for core CRUD operations #152118
  • Added modal to display versions changelog #152082
Fleet Server
  • Documented how to run Fleet Server locally #2212 #1423
  • Fleet Server now supports file uploads for a limited subset of integrations #1902
  • Extended the Fleet Server actions schema to support passing signed actions to the agent as part of agent tamper protection #2353
  • Fleet Server can now be run in stand-alone mode without needing to check into Kibana #2359 #2351
  • Added support for gathering secret values from files #2459
  • Added action APM metadata to help debug agent actions #2472
Elastic Agent

Enhancements

Fleet
  • Added overview dashboards in Fleet #154914
  • Added raw status to Agent details UI #154826
  • Added support for dynamic_namespace and dynamic_dataset #154732
  • Added the ability to show pipelines and mappings editor for input packages #154077
  • Added placeholder to integration select field #153927
  • Added the ability to show integration subcategories #153591
  • The create and update package policy APIs now return a 409 conflict when names are not unique #153533
  • Added the ability to display policy changes in Agent activity #153237
  • Added the ability to display errors in Agent activity with link to Logs #152583
  • Added support for select type in integrations #152550
  • Added the ability to make spaces plugin optional #152115
  • Added proxy ssl key and certificate to agent policy #152005
  • Added _meta field has_experimental_data_stream_indexing_features #151853
  • Added the ability to create templates and pipelines when updating package of a single package policy from type integration to input #150199
  • Added user’s secondary authorization to Transforms #154665
  • Added support for the Cloud Defend application to Elastic Agent #2477
  • Disabled signature validation in Elastic Agent so that only Endpoint Security validates policies and actions #2562
Fleet Server
  • Replaced upgrade expiration and minimum_execution_duration with rollout_duration_seconds #2243
  • Added a poll_timeout attribute to check-in requests that the client can use to inform Fleet Server of how long the client will hold the polling connection open #2491 #2337
  • Added a memory_limit configuration setting to help prevent OOM errors #2514
Elastic Agent
  • Made the download of Elastic Agent upgrade artifacts asynchronous during Fleet-managed upgrades and increased the download timeout to 2 hours #2205 #1706
  • Made the language used in CLI commands more consistent #2496

Bug fixes

Fleet
  • Fixed the package license check to use the new conditions.elastic.subscription field #154831
  • Fixed the OpenAPI spec for the Agent uploads API, changing the path from /agent/upload to /agent/uploads #151722
Fleet Server
  • Filtered out unused UPDATE_TAGS and FORCE_UNENROLL actions so that they are no longer delivered to Elastic Agent #2200
  • Fleet Server now ignores the unenroll_timeout field on agent policies, as it has been replaced by a configurable inactivity timeout #2096 #2063
  • Fixed Fleet Server discarding duplicate server keys input when creating configuration from a policy #2354 #2303
  • Fleet Server will no longer restart subsystems like API listeners and the Elasticsearch client when the log level changes #2454 #2453
Elastic Agent
  • Fixed the formatting of system metricsets in example Elastic Agent configuration file #2338
  • Fixed the parsing of paths from the container-paths.yml file #2340
  • Added a check to ensure that Elastic Agent was bootstrapped with the --fleet-server-* options #2505 #2170
  • Fixed an issue where inspect and diagnostics didn’t include the local Elastic Agent configuration #2529 #2390
  • Fixed a bug that caused heap profiles captured in the agent diagnostics to be unusable #2549 #2530
  • Fixed an issue that occurs when specifying a FLEET_SERVER_SERVICE_TOKEN_PATH with the agent running in a Docker container where both the token value and path are passed in the enroll section of the agent setup #2576