Dissect filter plugin

  • Plugin version: v1.2.0
  • Released on: 2018-06-24
  • Changelog

For other versions, see the Versioned plugin docs.

Getting Help

For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.

Description

The Dissect filter is a kind of split operation. Unlike a regular split operation where one delimiter is applied to the whole string, this operation applies a set of delimiters to a string value.
Dissect does not use regular expressions and is very fast.
However, if the structure of your text varies from line to line then Grok is more suitable.
In a hybrid case, Dissect can de-structure the section of the line that is reliably repeated, and Grok can then be applied to the remaining field values, with more predictable regular expressions and less overall work to do.

A set of fields and delimiters is called a dissection.

The dissection is described using a set of %{} sections:

%{a} - %{b} - %{c}

A field is the text from %{ to } inclusive.

A delimiter is the text between a } and next %{ characters.

Any run of characters that does not fit the pattern of %{, followed by characters other than }, followed by } is a delimiter.

The config might look like this:

  filter {
    dissect {
      mapping => {
        "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
      }
    }
  }

When dissecting a string from left to right, text is captured up to the first delimiter; this captured text is stored in the first field. This is repeated for each field/delimiter pair thereafter until the last delimiter is reached, then the remaining text is stored in the last field.
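
As a rough illustration of this left-to-right capture, the sketch below traces the mapping shown earlier against a sample line. The sample line is invented for this example, and the values in the comments follow from the rules described in this section rather than from captured plugin output.

```logstash
# Hypothetical input (not from the original documentation):
#   message = "Mar 16 00:01:25 evita - postfix[1713]: connect from host"
filter {
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
    }
  }
}
# Expected fields, reading left to right:
#   ts   => "Mar 16 00:01:25"   (three captures appended, joined by the space delimiter)
#   src  => "evita"
#   %{}  => "-" is captured but skipped
#   prog => "postfix"
#   pid  => "1713"
#   msg  => "connect from host" (remaining text after the last delimiter)
```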

The Key:
The key is the text between the %{ and }, exclusive of the ?, +, & prefixes and the ordinal suffix.
%{?aaa} - key is aaa
%{+bbb/3} - key is bbb
%{&ccc} - key is ccc

Normal field notation

The found value is added to the Event using the key.
%{some_field} - a normal field has no prefix or suffix

Skip field notation:
The found value is stored internally but not added to the Event.
The key, if supplied, is prefixed with a ?.

%{} is an empty skip field.

%{?foo} is a named skip field.
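
A minimal sketch of both skip forms, assuming an invented input line. The field names level, worker and msg are illustrative only.

```logstash
# Hypothetical input: message = "INFO 4242 worker-7 started"
filter {
  dissect {
    mapping => {
      # %{} skips "4242"; %{?worker} captures "worker-7" internally
      # but does not add it to the event.
      "message" => "%{level} %{} %{?worker} %{msg}"
    }
  }
}
# Fields added to the event: level => "INFO", msg => "started"
```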

Append field notation

The value is appended to another value, or stored if it is the first field seen.
The key is prefixed with a +.
The final value is stored in the Event using the key.

The delimiter found before the field is appended with the value.
If no delimiter is found before the field, a single space character is used.

%{+some_field} is an append field.
%{+some_field/2} is an append field with an order modifier.

An order modifier, /digits, allows one to reorder the append sequence.
e.g. for a text of 1 2 3 go, this %{+a/2} %{+a/1} %{+a/4} %{+a/3} will build a key/value of a => 2 1 go 3
Append fields without an order modifier will append in declared order.
e.g. for a text of 1 2 3 go, this %{a} %{b} %{+a} will build two key/values of a => 1 3 go, b => 2
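
The order modifier can be sketched with a date whose parts arrive in the wrong order. The input line and the field names date and msg are assumptions made for this example; the result in the closing comment is derived from the append rules above.

```logstash
# Hypothetical input: message = "Mar 16 2018 job finished"
filter {
  dissect {
    mapping => {
      # /2, /1 and /3 set the append order, so the day is placed first.
      "message" => "%{+date/2} %{+date/1} %{+date/3} %{msg}"
    }
  }
}
# By the rules above: date => "16 Mar 2018", msg => "job finished"
```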

Indirect field notation

The found value is added to the Event using the found value of another field as the key.
The key is prefixed with a &.
%{&some_field} - an indirect field where the key is indirectly sourced from the value of some_field.
e.g. for a text of error: some_error, some_description, this error: %{?err}, %{&err} will build a key/value of some_error => some_description.

For append and indirect fields, the key can refer to a field that already exists in the event before dissection.

Use a skip field if you do not want the indirection key/value stored.

e.g. for a text of google: 77.98, this %{?a}: %{&a} will build a key/value of google => 77.98.
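
In pipeline form, the same skip-plus-indirect pairing might look like the sketch below. The input line and the key name "name" are invented for illustration.

```logstash
# Hypothetical input: message = "status=200"
filter {
  dissect {
    mapping => {
      # %{?name} captures "status" without storing it; %{&name} then
      # stores the next value under that captured text as the key.
      "message" => "%{?name}=%{&name}"
    }
  }
}
# By the rules above: status => "200"
```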

Append and indirect cannot be combined and will fail validation.
%{+&something} - will add a value to the &something key, probably not the intended outcome.
%{&+something} will add a value to the +something key, again probably unintended.

Multiple Consecutive Delimiter Handling

Starting from version 1.1.1 of this plugin, the handling of multiple consecutive delimiters has changed. Multiple consecutive delimiters are now seen as missing fields by default, not as padding. If you are already using Dissect and your source text has fields padded with extra delimiters, you will need to change your config. Please read the section below.

Empty data between delimiters

Given this text as the sample used to create a dissection:

John Smith,Big Oaks,Wood Lane,Hambledown,Canterbury,CB34RY

The created dissection, with 6 fields, is:

%{name},%{addr1},%{addr2},%{addr3},%{city},%{zip}

When a line like this is processed:

Jane Doe,4321 Fifth Avenue,,,New York,87432

Dissect will create an event with empty fields for addr2 and addr3 like so:

{
  "name": "Jane Doe",
  "addr1": "4321 Fifth Avenue",
  "addr2": "",
  "addr3": "",
  "city": "New York",
  "zip": "87432"
}
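
The dissection above could be wired into a pipeline roughly as follows. The follow-up conditionals are an optional addition, not part of the original example: they use the standard mutate filter to drop address lines that came through empty.

```logstash
filter {
  dissect {
    mapping => {
      "message" => "%{name},%{addr1},%{addr2},%{addr3},%{city},%{zip}"
    }
  }
  # Optional cleanup (an assumption about what you may want):
  # remove address fields that were dissected as empty strings.
  if [addr2] == "" {
    mutate { remove_field => ["addr2"] }
  }
  if [addr3] == "" {
    mutate { remove_field => ["addr3"] }
  }
}
```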

Delimiters used as padding to visually align fields

Padding to the right hand side

Given these texts as the samples used to create a dissection:

00000043 ViewReceive     machine-321
f3000a3b Calc            machine-123

The dissection, with 3 fields, is:

%{id} %{function->} %{server}

Note, above, the second field has a -> suffix which tells Dissect to ignore padding to its right.
Dissect will create these events:

{
  "id": "00000043",
  "function": "ViewReceive",
  "server": "machine-321"
}
{
  "id": "f3000a3b",
  "function": "Calc",
  "server": "machine-123"
}

Always add the -> suffix to the field on the left of the padding.
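
A sketch of the right-padded dissection inside a filter block. It assumes the padded text arrives in the message field, as in the other examples in this document.

```logstash
filter {
  dissect {
    mapping => {
      # "function->" tells Dissect to treat the run of spaces after the
      # function name as padding rather than as empty fields.
      "message" => "%{id} %{function->} %{server}"
    }
  }
}
```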

Padding to the left hand side (to the human eye)

Given these texts as the samples used to create a dissection:

00000043     ViewReceive machine-321
f3000a3b            Calc machine-123

The dissection, with 3 fields, is now:

%{id->} %{function} %{server}

Here the -> suffix moves to the id field because Dissect sees the padding as being to the right of the id field.

Conditional processing

You probably want to use this filter inside an if block.
This ensures that the event contains a field value with a suitable structure for the dissection.

For example:

filter {
  if [type] == "syslog" or "syslog" in [tags] {
    dissect {
      mapping => {
        "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
      }
    }
  }
}

Dissect Filter Configuration Options

This plugin supports the following configuration options plus the Common Options described later.

Setting            Input type   Required
convert_datatype   hash         No
mapping            hash         No
tag_on_failure     array        No

Also see Common Options for a list of options supported by all filter plugins.

convert_datatype

  • Value type is hash
  • Default value is {}

With this setting int and float datatype conversions can be specified.
These will be done after all mapping dissections have taken place.
Feel free to use this setting on its own without a mapping section.

For example

filter {
  dissect {
    convert_datatype => {
      "cpu" => "float"
      "code" => "int"
    }
  }
}
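
When conversions are combined with a mapping, the pairing might look like the sketch below. The input line is invented for illustration; the field names match the example above.

```logstash
# Hypothetical input: message = "200 0.75"
filter {
  dissect {
    mapping => {
      "message" => "%{code} %{cpu}"
    }
    # Conversions run after the mapping, so code and cpu already exist.
    convert_datatype => {
      "code" => "int"
      "cpu" => "float"
    }
  }
}
# Expected result: code => 200 (integer), cpu => 0.75 (float)
```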

mapping

  • Value type is hash
  • Default value is {}

A hash of dissections of field => value.

Don’t use an escaped newline \n in the value; it will be seen as two characters, \ + n. Instead, use actual line breaks in the config. Also use single quotes to define the value if it contains double quotes.

A later dissection can be done on values from a previous dissection or they can be independent.

For example

filter {
  dissect {
    mapping => {
      # using an actual line break
      "message" => '"%{field1}" "%{field2}"
 "%{description}"'
      "description" => "%{field3} %{field4} %{field5}"
    }
  }
}

This is useful if you want to keep the field description but also dissect it some more.

tag_on_failure

  • Value type is array
  • Default value is ["_dissectfailure"]

Append values to the tags field when dissection fails.
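
One possible use is to set a custom tag and branch on it later in the pipeline. The tag name and the dissect_status field below are hypothetical, chosen for this sketch only; the mutate filter is a separate standard plugin.

```logstash
filter {
  dissect {
    mapping => {
      "message" => "%{id} %{function->} %{server}"
    }
    # Custom tag name, chosen for this example only.
    tag_on_failure => ["_dissect_padding_failure"]
  }
  # React to the tag later in the pipeline, for instance by recording
  # that the event was not dissected.
  if "_dissect_padding_failure" in [tags] {
    mutate { add_field => { "dissect_status" => "failed" } }
  }
}
```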

Common Options

The following configuration options are supported by all filter plugins:

add_field

  • Value type is hash
  • Default value is {}

If this filter is successful, add any arbitrary fields to this event. Field names can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      dissect {
        add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
      }
    }
    # You can also add multiple fields at once:
    filter {
      dissect {
        add_field => {
          "foo_%{somefield}" => "Hello world, from %{host}"
          "new_field" => "new_static_value"
        }
      }
    }

If the event has field "somefield" == "hello", this filter, on success, would add the field foo_hello, with the value above and the %{host} piece replaced with the value of that field from the event. The second example would also add a hardcoded field.

add_tag

  • Value type is array
  • Default value is []

If this filter is successful, add arbitrary tags to the event. Tags can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      dissect {
        add_tag => [ "foo_%{somefield}" ]
      }
    }
    # You can also add multiple tags at once:
    filter {
      dissect {
        add_tag => [ "foo_%{somefield}", "taggedy_tag"]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would add a tag foo_hello (and the second example would of course add a taggedy_tag tag).

enable_metric

  • Value type is boolean
  • Default value is true

Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.

id

  • Value type is string
  • There is no default value for this setting.

Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, if you have 2 dissect filters. Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.

    filter {
      dissect {
        id => "ABC"
      }
    }

Variable substitution in the id field only supports environment variables and does not support the use of values from the secret store.

periodic_flush

  • Value type is boolean
  • Default value is false

Call the filter flush method at regular intervals. Optional.

remove_field

  • Value type is array
  • Default value is []

If this filter is successful, remove arbitrary fields from this event. Field names can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      dissect {
        remove_field => [ "foo_%{somefield}" ]
      }
    }
    # You can also remove multiple fields at once:
    filter {
      dissect {
        remove_field => [ "foo_%{somefield}", "my_extraneous_field" ]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would remove the field with name foo_hello if it is present. The second example would remove an additional, non-dynamic field.

remove_tag

  • Value type is array
  • Default value is []

If this filter is successful, remove arbitrary tags from the event. Tags can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      dissect {
        remove_tag => [ "foo_%{somefield}" ]
      }
    }
    # You can also remove multiple tags at once:
    filter {
      dissect {
        remove_tag => [ "foo_%{somefield}", "sad_unwanted_tag"]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would remove the tag foo_hello if it is present. The second example would remove a sad, unwanted tag as well.