Managing Multiline Events

edit

Several use cases generate events that span multiple lines of text. In order to correctly handle these multline events, Logstash needs to know how to tell which lines are part of a single event.

Multiline event processing is complex and relies on proper event ordering. The best way to guarantee ordered log processing is to implement the processing as early in the pipeline as possible. The preferred tool in the Logstash pipeline is the multiline codec, which merges lines from a single input using a simple set of rules.

For more complex needs, the multiline filter performs a similar task at the filter stage of processing, where the Logstash instance aggregates multiple inputs.

The most important aspects of configuring either multiline plugin are the following:

  • The pattern option specifies a regular expression. Lines that match the specified regular expression are considered either continuations of a previous line or the start of a new multiline event. You can use grok regular expression templates with this configuration option.
  • The what option takes two values: previous or next. The previous value specifies that lines that match the value in the pattern option are part of the previous line. The next value specifies that lines that match the value in the pattern option are part of the following line.* The negate option applies the multiline codec to lines that do not match the regular expression specified in the pattern option.

See the full documentation for the multiline codec or the multiline filter plugin for more information on configuration options.

Multiline Special Cases

edit
  • The current release of the multiline codec plugin treats all input from the lumberjack input plugin as a single stream. When your use case involves the Logstash Forwarder processing multiple files concurrently, proper event ordering can be challenging to maintain, and any resulting errors can be difficult to diagnose. Carefully monitor the output of Logstash configurations that involve multiline processing of multiple files handled by the Logstash Forwarder.
  • The multiline codec plugin does not support file input from files that contain events from multiple sources.
  • The multiline filter plugin is not thread-safe. Avoid using multiple filter workers with the multiline filter.

You can track the progress of upgrades to the functionality of the multiline codec at this Github issue.

Examples of Multiline Plugin Configuration

edit

The examples in this section cover the following use cases:

  • Combining a Java stack trace into a single event
  • Combining C-style line continuations into a single event
  • Combining multiple lines from time-stamped events

Java Stack Traces

edit

Java stack traces consist of multiple lines, with each line after the initial line beginning with whitespace, as in this example:

Exception in thread "main" java.lang.NullPointerException
        at com.example.myproject.Book.getTitle(Book.java:16)
        at com.example.myproject.Author.getBookTitles(Author.java:25)
        at com.example.myproject.Bootstrap.main(Bootstrap.java:14)

To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:

input {
  stdin {
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}

This configuration merges any line that begins with whitespace up to the previous line.

Line Continuations

edit

Several programming languages use the \ character at the end of a line to denote that the line continues, as in this example:

printf ("%10.10ld  \t %10.10ld \t %s\
  %f", w, x, y, z );

To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:

input {
  stdin {
    codec => multiline {
      pattern => "\\$"
      what => "next"
    }
  }
}

This configuration merges any line that ends with the \ character with the following line.

Timestamps

edit

Activity logs from services such as Elasticsearch typically begin with a timestamp, followed by information on the specific activity, as in this example:

[2015-08-24 11:49:14,389][INFO ][env                      ] [Letha] using [1] data paths, mounts [[/
(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]

To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:

input {
  file {
    path => "/var/log/someapp.log"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
  }
}

This configuration uses the negate option to specify that any line that does not begin with a timestamp belongs to the previous line.