Common problems
This section describes common problems you might encounter with APM Server.
No data is indexed
If no data shows up in Elasticsearch, first check that the APM components are properly connected.
To verify that the APM Server configuration is valid and that the server can connect to the configured output (Elasticsearch by default), run the following commands:
apm-server test config
apm-server test output
To see if the agent can connect to the APM Server, send requests to the instrumented service and look for lines containing [request] in the APM Server logs.
If no requests are logged, SSL might be misconfigured or the host might be wrong.
In particular, if you are using Docker, make sure that APM Server binds to the right interface (for example, set apm-server.host = 0.0.0.0:8200 to match any IP) and set the SERVER_URL setting in the agent accordingly.
If you see requests coming through the APM Server but they are not accepted (response code other than 202), use the response code to narrow down the possible causes (see the sections below).
Another reason for data not showing up is that the agent is not auto-instrumenting something you expected it to. Check the agent documentation for details on what is automatically instrumented.
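The response codes covered in the sections below can be collected into a small lookup table. The following is a minimal sketch in Python. The helper name `likely_cause` and the wording of the hints are illustrative assumptions that paraphrase this guide; they are not part of APM Server.

```python
# Hints paraphrased from this guide; not an official or exhaustive list.
LIKELY_CAUSES = {
    400: "Agent and APM Server versions may be incompatible (decoding or validation error).",
    401: "The secret token sent by the agent does not match the one configured in APM Server.",
    403: "RUM is not enabled, or the origin is not listed in apm-server.rum.allow_origins.",
    413: "The request body is too large; see apm-server.max_unzipped_size.",
    503: "The internal queue is full, or the request timed out waiting to be processed.",
}


def likely_cause(status_code: int) -> str:
    """Return a troubleshooting hint for a response code (hypothetical helper)."""
    if status_code == 202:
        return "Accepted: the request was processed normally."
    return LIKELY_CAUSES.get(
        status_code, "Not covered in this guide; check the APM Server logs."
    )


print(likely_cause(401))
```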
HTTP 400: Data decoding error / Data validation error
The most likely cause is that you are using incompatible versions of the agent and APM Server. For instance, APM Server 6.2.0 changed the Intake API spec and requires a minimum version of each agent.
HTTP 401: Invalid token
The secret token in the request header doesn’t match the one configured in the APM Server.
HTTP 403: Forbidden request
Either you are sending requests to a RUM endpoint without RUM enabled, or a request is coming from an origin not whitelisted in apm-server.rum.allow_origins. See the RUM configuration.
HTTP 413: Request body too large
The agent is collecting too much data and sending it all at once. Consider increasing the apm-server.max_unzipped_size setting in the APM Server, and adjust relevant settings in the agent.
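To get a feel for how large a compressed payload becomes once it is unzipped, you can measure it directly. The sketch below assumes a gzip-compressed body. The function names, the sample payload, and the limit passed in are all hypothetical, and the real value of apm-server.max_unzipped_size should be taken from your own configuration.

```python
import gzip


def unzipped_size(body: bytes) -> int:
    """Return the size in bytes of a gzip-compressed body after decompression."""
    return len(gzip.decompress(body))


def exceeds_limit(body: bytes, max_unzipped_size: int) -> bool:
    """Return True if the decompressed body is larger than the given limit."""
    return unzipped_size(body) > max_unzipped_size


# Illustrative payload: repetitive data compresses very well, so the
# compressed size says little about the unzipped size.
payload = gzip.compress(b'{"example": "event"}\n' * 1000)
print(len(payload), unzipped_size(payload))
```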
HTTP 503: Queue is full
APM Server has an internal queue that helps to:
- buffer data temporarily if Elasticsearch is intermittently unavailable
- handle sudden large spikes of data
- send documents to Elasticsearch in bulk, instead of individually
When the queue has reached the maximum size, APM Server returns an HTTP 503 status with the message "Queue is full".
A full queue generally means that the agents collect more data than APM Server is able to process. This might happen when APM Server is not configured properly for the size of your Elasticsearch cluster, or when your Elasticsearch cluster is underpowered or not configured properly for the given workload.
The queue can also fill up if Elasticsearch runs out of disk space.
If the APM Server only returns 503 responses, it indicates that an Elasticsearch disk might be full. If the APM Server returns interleaved 503 and 202 responses, it indicates that the APM Server can’t process that much data.
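The distinction between these two response patterns can be expressed as a short check over the status codes an agent has recently received. This is a sketch only: the function name is hypothetical, the codes are assumed to have been collected by you (for example, from agent logs), and codes other than 202 and 503 are ignored.

```python
def interpret_queue_responses(status_codes):
    """Map a sequence of observed status codes to the hint given in this guide.

    Only 202 and 503 are considered; any other codes are ignored.
    """
    codes = set(status_codes)
    if 503 not in codes:
        return "no 503 responses observed"
    if 202 in codes:
        return "interleaved 503 and 202: APM Server can't process that much data"
    return "only 503: an Elasticsearch disk might be full"


print(interpret_queue_responses([503, 202, 503, 202]))
```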
You have a few options to solve this problem:
HTTP 503: Request timed out waiting to be processed
This happens when APM Server exceeds the maximum number of requests that it can process concurrently.
This limit is determined by the apm-server.concurrent_requests configuration parameter.
To alleviate this problem, you can try to:
SSL client fails to connect
The target host might be unreachable or the certificate might not be valid. To resolve the issue:
- Make sure that the server process on the target host is running and that you can connect to it. First, try to ping the target host to verify that you can reach it from the host running APM Server. Then use either nc or telnet to make sure that the port is available. For example:
  ping <hostname or IP>
  telnet <hostname or IP> 5044
- Verify that the certificate is valid and that the hostname and IP match.
- Use OpenSSL to test connectivity to the target server and diagnose problems. See the OpenSSL documentation for more info.
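If nc and telnet are not installed on the host, the same port check can be approximated with a few lines of Python. This is a minimal sketch: the function name `port_is_reachable` and the default timeout are assumptions, and a successful TCP connection says nothing about whether the SSL handshake will succeed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures, and timeouts.
        return False
```

For example, `port_is_reachable("<hostname or IP>", 5044)` plays the same role as the telnet command above.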
Common SSL-Related Errors and Resolutions
Here are some common errors and ways to fix them:
x509: cannot validate certificate for <IP address> because it doesn’t contain any IP SANs
This happens because your certificate is only valid for the hostname present in the Subject field.
To resolve this problem, try one of these solutions:
- Create a DNS entry for the hostname mapping it to the server’s IP.
- Create an entry in /etc/hosts for the hostname. On Windows, add the entry to C:\Windows\System32\drivers\etc\hosts.
- Re-create the server certificate and add a SubjectAltName (SAN) for the IP address of the server. This makes the server’s certificate valid for both the hostname and the IP address.
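To confirm whether a certificate actually carries an IP SAN, you can inspect its subjectAltName entries. The sketch below works on the dictionary that Python's `ssl.SSLSocket.getpeercert()` returns for a validated certificate. The function name and the sample certificate data are hypothetical.

```python
import ipaddress


def cert_has_ip_san(cert: dict, ip: str) -> bool:
    """Return True if the certificate lists `ip` as an IP Address SAN.

    `cert` is expected in the format returned by ssl.SSLSocket.getpeercert().
    """
    wanted = ipaddress.ip_address(ip)
    for kind, value in cert.get("subjectAltName", ()):
        if kind != "IP Address":
            continue
        try:
            if ipaddress.ip_address(value.strip()) == wanted:
                return True
        except ValueError:
            # Skip entries that are not parseable IP addresses.
            continue
    return False


# Hypothetical certificate data for illustration only.
sample_cert = {
    "subjectAltName": (("DNS", "apm.example.com"), ("IP Address", "10.0.0.5")),
}
print(cert_has_ip_san(sample_cert, "10.0.0.5"))
```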
getsockopt: no route to host
This is not an SSL problem. It’s a networking problem. Make sure the two hosts can communicate.
getsockopt: connection refused
This is not an SSL problem. Make sure that Logstash is running and that there is no firewall blocking the traffic.
No connection could be made because the target machine actively refused it
A firewall is refusing the connection. Check if a firewall is blocking the traffic on the client, the network, or the destination host.