Logging integration

Many applications use logging frameworks to help record, format, and append an application’s logs. Elastic APM makes those logs more useful by integrating with the most popular logging frameworks in their respective languages. You can inject trace information into your logs, explore them in the Logs app, and then jump straight to the corresponding APM traces, all while preserving the trace context.

To get started:

  1. Enable log correlation
  2. Add APM identifiers to your logs
  3. Ingest your logs into Elasticsearch

Enable log correlation

Some Agents require you to first enable log correlation in the Agent. This is done with a configuration variable, and is different for each Agent. See the relevant Agent documentation for further information.

Add APM identifiers to your logs

Once log correlation is enabled, you must ensure your logs contain APM identifiers. In some supported frameworks, this is already done for you. In other scenarios, such as unstructured logs, you’ll need to add APM identifiers to your logs in an easy-to-parse manner.

The identifiers we’re interested in are: trace.id and transaction.id. Certain Agents also support the span.id field.

The process for adding these fields differs based on the Agent you’re using, the logging framework, and the type and structure of your logs.

See the relevant Agent documentation to learn more.
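
As a rough illustration of what adding identifiers to plain-text logs can look like, the sketch below uses only Python’s standard logging module to append them to each line. The get_current_ids helper, the ApmIdFilter class, and the hard-coded IDs are hypothetical stand-ins for this example; in a real application the values come from your APM Agent, which may ship its own logging integration.

```python
import io
import logging


def get_current_ids():
    """Hypothetical stand-in: a real application would ask its APM Agent for these."""
    return {
        "transaction_id": "fcfbbe447b9b6b5a",
        "trace_id": "f965f4cc5b59bdc62ae349004eece70c",
        "span_id": None,
    }


class ApmIdFilter(logging.Filter):
    """Attach the current APM identifiers to every log record."""

    def filter(self, record):
        ids = get_current_ids()
        record.transaction_id = ids["transaction_id"]
        record.trace_id = ids["trace_id"]
        record.span_id = ids["span_id"]
        return True  # never drop the record, only decorate it


# Assumed line layout: message first, identifiers appended after a pipe.
LOG_FORMAT = (
    "%(levelname)s - %(message)s | elasticapm "
    "transaction.id=%(transaction_id)s trace.id=%(trace_id)s span.id=%(span_id)s"
)


def build_logger(stream):
    """Return a logger that writes decorated lines to the given stream."""
    logger = logging.getLogger("apm_id_demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    handler.addFilter(ApmIdFilter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
build_logger(buffer).error("request failed")
decorated_line = buffer.getvalue().strip()
print(decorated_line)
```

Keeping the identifiers in a fixed key=value layout at the end of the line makes them straightforward to extract later, whatever the rest of the message contains.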

Ingest your logs into Elasticsearch

Once your logs contain the appropriate identifiers (fields), you need to ingest them into Elasticsearch. Filebeat, Elastic’s log shipper, is built for this, and the Filebeat quick start guide walks you through the setup process.

Because logging frameworks and formats vary greatly between programming languages, there is no one-size-fits-all approach for ingesting your logs into Elasticsearch. The following tips should point you in the right direction:

Download Filebeat

There are many ways to download and get started with Filebeat. Read the Filebeat quick start guide to determine which is best for you.

Configure Filebeat

Modify the filebeat.yml configuration file to your needs. Here are some recommendations:

  • Set filebeat.inputs to point to the source of your logs
  • Point Filebeat to the same Elastic Stack that is receiving your APM data
  • If you’re using Elastic Cloud, set cloud.id and cloud.auth.
  • If you’re using a manual setup, use output.elasticsearch.hosts.
filebeat.inputs:
- type: log 
  paths: 
    - /var/log/*.log
cloud.id: "staging:dXMtZWFzdC0xLmF3cy5mb3VuZC5pbyRjZWMNjN2Q3YTllOTYyNTc0Mw==" 
cloud.auth: "elastic:YOUR_PASSWORD" 

  • type: log configures the log input.
  • paths lists the path(s) that must be crawled to fetch the log lines.
  • cloud.id is used to resolve the Elasticsearch and Kibana URLs for Elastic Cloud.
  • cloud.auth is the authorization token for Elastic Cloud.

JSON logs

For JSON logs you can use the log input to read lines from log files. Here’s what a sample configuration might look like:

filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log   # example path; point this at your JSON log files
  json.keys_under_root: true
  json.add_error_key: true
  json.message_key: message

  • json.keys_under_root: true copies JSON keys to the top level in the output document.
  • json.add_error_key tells Filebeat to add an error.message and error.type: json key in case of JSON unmarshalling errors.
  • json.message_key specifies the JSON key on which to apply line filtering and multiline settings.
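
To show the kind of file such an input reads, here is a minimal sketch of an application writing one JSON object per log line with Python’s standard library. The JsonLineFormatter class, the trace_id and transaction_id record attributes, and the flat dotted key names are assumptions made for this example, not something your Agent or logging framework necessarily produces.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        document = {
            "message": record.getMessage(),
            "log.level": record.levelname,
            # Hypothetical attribute names; a real Agent integration would set the values.
            "trace.id": getattr(record, "trace_id", None),
            "transaction.id": getattr(record, "transaction_id", None),
        }
        return json.dumps(document)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonLineFormatter())

logger = logging.getLogger("json_log_demo")
logger.handlers.clear()
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

# The IDs are passed by hand here purely for illustration.
logger.error(
    "request failed",
    extra={
        "trace_id": "f965f4cc5b59bdc62ae349004eece70c",
        "transaction_id": "fcfbbe447b9b6b5a",
    },
)

json_line = stream.getvalue().strip()
parsed = json.loads(json_line)
print(json_line)
```

Because each event is already structured, no extra parsing step is needed to recover the identifiers; the message key used here matches the json.message_key setting in the sample configuration.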

Parsing unstructured logs

Consider the following log that is decorated with the transaction.id and trace.id fields:

2019-09-18 21:29:49,525 - django.server - ERROR - "GET / HTTP/1.1" 500 27 | elasticapm transaction.id=fcfbbe447b9b6b5a trace.id=f965f4cc5b59bdc62ae349004eece70c span.id=None

All that’s needed now is an ingest node processor to pre-process your logs and extract these structured fields before they are indexed in Elasticsearch. To do this, you’d need to create a pipeline that uses Elasticsearch’s Grok Processor. Here’s an example:

PUT _ingest/pipeline/log-correlation
{
  "description": "Parses the log correlation IDs out of the raw plain-text log",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{GREEDYDATA:message} \\| elasticapm transaction.id=%{DATA:transaction.id} trace.id=%{DATA:trace.id} span.id=%{DATA:span.id}$"]
      }
    }
  ]
}

  • field: the field to use for grok expression parsing.
  • patterns: an ordered list of grok expressions to match and extract named captures with. %{DATA:transaction.id} captures the value of transaction.id, %{DATA:trace.id} captures the value of trace.id, and %{DATA:span.id} captures the value of span.id.

Depending on how you’ve added APM data to your logs, you may need to adjust this grok pattern to match your setup. You can also extract more structure from your logs. Follow the Elastic Common Schema when defining which fields you store in Elasticsearch.
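
Before creating a pipeline, it can help to try the extraction logic locally. The sketch below applies a plain Python regular expression, written in the spirit of the grok pattern, to the sample log line from this section. It is an approximation and not the grok engine itself: GREEDYDATA is treated as ".*" and DATA as ".*?", and the extract_ids helper is a name invented for this example. In a regular expression the pipe must be escaped to match a literal "|", and the trailing "$" makes the last lazy group consume the rest of the line.

```python
import re

SAMPLE_LINE = (
    '2019-09-18 21:29:49,525 - django.server - ERROR - "GET / HTTP/1.1" 500 27 '
    "| elasticapm transaction.id=fcfbbe447b9b6b5a "
    "trace.id=f965f4cc5b59bdc62ae349004eece70c span.id=None"
)

# Approximation of the grok pattern using Python regex syntax.
LOG_CORRELATION_RE = re.compile(
    r"(?P<message>.*) \| elasticapm "
    r"transaction\.id=(?P<transaction_id>.*?) "
    r"trace\.id=(?P<trace_id>.*?) "
    r"span\.id=(?P<span_id>.*?)$"
)


def extract_ids(line):
    """Return the captured fields as a dict, or None when the line does not match."""
    match = LOG_CORRELATION_RE.search(line)
    if match is None:
        return None
    return {
        "message": match.group("message"),
        "transaction.id": match.group("transaction_id"),
        "trace.id": match.group("trace_id"),
        "span.id": match.group("span_id"),
    }


fields = extract_ids(SAMPLE_LINE)
print(fields)
```

Running a handful of real log lines through a check like this is a quick way to spot layout differences before they reach the ingest pipeline.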

Then, configure Filebeat to use the pipeline in filebeat.yml:

output.elasticsearch:
  pipeline: "log-correlation"

If your logs contain messages that span multiple lines of text (common in Java stack traces), you’ll also need to configure multiline settings.

The following example shows how to configure Filebeat to handle a multiline message where the first line of the message begins with a bracket ([).

multiline.pattern: '^\['
multiline.negate: true
multiline.match: after
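
To make the effect of these three settings concrete, the following sketch imitates the grouping in plain Python: a line that matches the pattern starts a new event, and each following line that does not match is appended to it. This is a simplified model for illustration only, not Filebeat’s implementation; the group_multiline function and the sample lines are invented for the example.

```python
import re

MULTILINE_PATTERN = re.compile(r"^\[")


def group_multiline(lines):
    """Group lines into events, modelling negate: true with match: after.

    A line that matches the pattern starts a new event; every following
    line that does not match is appended to that event.
    """
    events = []
    for line in lines:
        if MULTILINE_PATTERN.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


# Hypothetical log lines: a stack trace followed by an unrelated entry.
raw_lines = [
    "[2019-09-18 21:29:49] ERROR Unhandled exception",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Demo.main(Demo.java:12)",
    "[2019-09-18 21:29:50] INFO Recovered",
]

events = group_multiline(raw_lines)
for event in events:
    print(repr(event))
```

With this grouping, the stack trace lines travel with the log entry that produced them instead of being indexed as separate events.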