Elastic APM known issues

Known issues are significant defects or limitations that may impact your implementation. These issues are actively being worked on and will be addressed in a future release. Review the Elastic APM known issues to help you make informed decisions, such as upgrading to a new version.

Elastic Stack versions: 8.15.0, 8.15.1, 8.15.2, 8.15.3
Fixed in Elastic Stack version 8.15.4

The issue only occurs when upgrading the Elastic Stack from 8.12.2 or lower directly to any 8.15.x version prior to 8.15.4. The issue does not occur when creating a new cluster using any 8.15.x version, or when upgrading from 8.12.2 to 8.13.x or 8.14.x and then to 8.15.x.

In APM Servers versions prior to 8.13.0, an ingestion pipeline exists to perform a check on the version. The version check would fail any APM document produced with a different version of APM server compared to the version of the installed APM’s ingest pipeline. In 8.13.0 the version check in the ingest pipeline was removed. Due to the combination of an internal change in how apm data management assets are set up from 8.15 onwards and a bug in Elasticsearch, related to lazy rollover of data streams, the ingestion pipeline conducting the version check is not removed on upgrade and prevents the ingestion of data.

If the deployment is running 8.15.0, upgrade the deployment to 8.15.1 or above. A manual rollover of all APM data streams is required to pick up the new index templates and remove the faulty ingest pipeline version check. Perform the following requests to Elasticsearch (they are assuming the default namespace is used, adjust if necessary):

		POST /traces-apm-default/_rollover
POST /traces-apm.rum-default/_rollover
POST /logs-apm.error-default/_rollover
POST /logs-apm.app-default/_rollover
POST /metrics-apm.app-default/_rollover
POST /metrics-apm.internal-default/_rollover
POST /metrics-apm.service_destination.1m-default/_rollover
POST /metrics-apm.service_destination.10m-default/_rollover
POST /metrics-apm.service_destination.60m-default/_rollover
POST /metrics-apm.service_summary.1m-default/_rollover
POST /metrics-apm.service_summary.10m-default/_rollover
POST /metrics-apm.service_summary.60m-default/_rollover
POST /metrics-apm.service_transaction.1m-default/_rollover
POST /metrics-apm.service_transaction.10m-default/_rollover
POST /metrics-apm.service_transaction.60m-default/_rollover
POST /metrics-apm.transaction.1m-default/_rollover
POST /metrics-apm.transaction.10m-default/_rollover
POST /metrics-apm.transaction.60m-default/_rollover
		
	

Elastic Stack versions: 8.15.0
Fixed in Elastic Stack version 8.15.1

The issue only occurs when upgrading the Elastic Stack to 8.15.0. The issue does not occur when creating a new cluster using 8.15.0. The issue also does not occur if a custom ILM policy is configured using a custom component template.

In 8.15.0, APM Server switched to use data stream lifecycle to manage data retention for APM indices for new deployments as well as for upgraded deployments with default lifecycle configurations. Unfortunately, since any data stream created before 8.15.0 does not have a data stream lifecycle configuration, such existing data streams become unmanaged for default lifecycle configurations.

Upgrading to 8.15.1 resolves the lifecycle issue for any new indices created for APM data streams. However, indices created in version 8.15.0 will remain unmanaged if the default ILM policy is in place. To fix these unmanaged indices, consider one of the following approaches:

Manually delete the unmanaged indices when they are no longer needed.
Explicitly configure APM data streams to use the default data stream lifecycle configuration. This approach migrates all affected data streams to use data stream lifecycles, maintaining behavior equivalent to the default ILM policies. Apply this fix only to data streams that have unmanaged indices due to missing default ILM policies.
```
PUT _data_stream/{{data_stream_name}}-{{data_stream_namespace}}/_lifecycle
{
  "data_retention": <data_retention_period>
}
		
```

Default <data_retention_period> for each data stream is available in this guide.

This issue is fixed in 8.15.1 (elastic/elasticsearch#112432).

Elastic Stack versions: 8.13.0, 8.13.1, 8.13.2
Fixed in Elastic Stack version 8.13.3

This issue occurs when upgrading the Elastic Stack to version 8.13.0, 8.13.1, or 8.13.2. This issue may go unnoticed unless you actively monitor your Kibana logs. The following log indicates the presence of this issue:

		"params invalid: [anomalyDetectorTypes]: expected value of type [array] but got [undefined]"
		
	

This issue occurs because a non-optional parameter, anomalyDetectorTypes was added in 8.13.0 without the presence of an automation migration script. This breaks pre-existing rules as they do not have this parameter and will fail validation. This issue is fixed in v8.13.3.

There are three ways to fix this error:

Upgrade to version 8.13.3
Fix broken anomaly rules in the APM UI (no upgrade required)
Fix broken anomaly rules with Kibana APIs (no upgrade required)

Fix broken anomaly rules in the APM UI

From any APM page in Kibana, select Alerts and rules → Manage rules.
Filter your rules by setting Type to APM Anomaly.
For each anomaly rule in the list, select the pencil icon to edit the rule.
Add one or more DETECTOR TYPES to the rule.

The detector type determines when the anomaly rule triggers. For example, a latency anomaly rule will trigger when the latency of the service being monitored is abnormal. Supported detector types are latency, throughput, and failed transaction rate.
Click Save.

Fix broken anomaly rules with Kibana APIs

Find broken rules

Note

To identify rules in this exact state, you can use the find rules endpoint and search for the APM anomaly rule type as well as this exact error message indicating that the rule is in the broken state. We will also use the fields parameter to specify only the fields required when making the update request later.

search_fields=alertTypeId
search=apm.anomaly
filter=alert.attributes.executionStatus.error.message:"params invalid: [anomalyDetectorTypes]: expected value of type [array] but got [undefined]"
fields=[id, name, actions, tags, schedule, notify_when, throttle, params]

The encoded request might look something like this:

		curl -u "$KIBANA_USER":"$KIBANA_PASSWORD" "$KIBANA_URL/api/alerting/rules/_find?search_fields=alertTypeId&search=apm.anomaly&filter=alert.attributes.executionStatus.error.message%3A%22params%20invalid%3A%20%5BanomalyDetectorTypes%5D%3A%20expected%20value%20of%20type%20%5Barray%5D%20but%20got%20%5Bundefined%5D%22&fields=id&fields=name&fields=actions&fields=tags&fields=schedule&fields=notify_when&fields=throttle&fields=params"
		
	

		{
  "page": 1,
  "total": 1,
  "per_page": 10,
  "data": [
    {
      "id": "d85e54de-f96a-49b5-99d4-63956f90a6eb",
      "name": "APM Anomaly Jason Test FAILING [2]",
      "tags": [
        "test",
        "jasonrhodes"
      ],
      "throttle": null,
      "schedule": {
        "interval": "1m"
      },
      "params": {
        "windowSize": 30,
        "windowUnit": "m",
        "anomalySeverityType": "warning",
        "environment": "ENVIRONMENT_ALL"
      },
      "notify_when": null,
      "actions": []
    }
  ]
}
		
	

Prepare the update JSON doc(s)
Note
For each broken rule found, create a JSON rule document with what was returned from the API in the previous step. You will need to make two changes to each document:
1. Remove the id key but keep the value connected to this document (e.g. rename the file to {{id}}.json). The id cannot be sent as part of the request body for the PUT request, but you will need it for the URL path.
2. Add the "anomalyDetectorTypes" to the "params" block, using the default value as seen below to mimic the pre-8.13 behavior:
  { "params": { // ... other existing params should stay here, // with the required one added to this object "anomalyDetectorTypes": [ "txLatency", "txThroughput", "txFailureRate" ] } }
Update each rule using the PUT /api/alerting/rule/{{id}} API
Note
For each rule, submit a PUT request to the update rule endpoint using that rule’s ID and its stored update document from the previous step. For example, assuming the first broken rule’s ID is 046c0d4f:
curl -u "$KIBANA_USER":"$KIBANA_PASSWORD" -XPUT "$KIBANA_URL/api/alerting/rule/046c0d4f" -H 'Content-Type: application/json' -H 'kbn-xsrf: rule-update' -d @046c0d4f.json
Once the PUT request executes successfully, the rule will no longer be broken.

APM Server versions: 8.11.0, 8.11.1
Elastic APM Java agent versions: 1.39.0+

If you’re using the Elastic APM Java agent v1.39.0+ to send new JVM metrics to APM Server v8.9.x and v8.10.x, upgrading to 8.11.0 or 8.11.1 will silently fail and stop ingesting APM metrics.

After upgrading, you will see the following errors:

APM Server error logs:

		failed to index document in 'metrics-apm.internal-default' (fail_processor_exception): Document produced by APM Server v8.11.1, which is newer than the installed APM integration (v8.10.3-preview-1695284222). The APM integration must be upgraded.
		
	

Fleet error on integration package upgrade:

		Failed installing package [apm] due to error: [ResponseError: mapper_parsing_exception
	Root causes:
		mapper_parsing_exception: Field [jvm.memory.non_heap.pool.committed] attempted to shadow a time_series_metric]
		
	

A fix was released in 8.11.2: elastic/kibana#171712.

APM Server versions: <= 8.12.1 +

If you’re upgrading APM integration package to any versions <= 8.12.1, in some rare cases, the upgrade fails with a mapping conflict error. The upgrade process keeps rolling over the data stream in an unsuccessful attempt to work around the error. As a result, many empty backing indices for APM data streams are created.

During upgrade, you will see errors similar to the one below:

Fleet error on integration package upgrade:

		Mappings update for metrics-apm.service_destination.10m-default failed due to ResponseError: illegal_argument_exception
	Root causes:
		illegal_argument_exception: Mapper for [metricset.interval] conflicts with existing mapper:
	Cannot update parameter [value] from [10m] to [null]
		
	

A fix was released in 8.12.2: elastic/apm-server#12219.

APM Server versions: >=8.13.0, <= 8.14.2

If you’re on APM server version >=8.13.0, <= 8.14.2_, using Elasticsearch output, do not specify any output.elasticsearch.flush_bytes, and do not disable compression explicitly by setting output.elasticsearch.compression_level to 0, APM server will issue smaller bulk requests of 24KB size, and more bulk requests will need to be made to maintain the original throughput. This causes Elasticsearch to experience higher load, and APM server may exhibit Elasticsearch backpressure symptoms.

This happens because a performance regression was introduced, such that the default value of bulk indexer flush bytes was reduced from 1MB to 24KB.

Affected APM servers will emit the following log:

		flush_bytes config value is too small (0) and might be ignored by the indexer, increasing value to 24576
		
	

To work around the issue, modify the Elasticsearch output configuration in APM.

For APM Server binary
- In apm-server.yml, set output.elasticsearch.flush_bytes: 1mib
For Fleet-managed APM (non-Elastic Cloud)
- In Fleet, open the Settings tab.
- Under Outputs, identify the Elasticsearch output that receives from APM, select the edit icon.
- In the Edit output flyout, in "Advanced YAML configuration" field, add line flush_bytes: 1mib.
For Elastic Cloud
- It is not possible to edit the Fleet "Elastic Cloud internal output".

A fix will be released in 8.14.3: elastic/apm-server#13576.