Dissect filter plugin
editDissect filter plugin
edit- Plugin version: v1.2.4
- Released on: 2022-01-05
- Changelog
For other versions, see the Versioned plugin docs.
Getting Help
editFor questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in Github. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.
Description
editThe Dissect filter plugin tokenizes incoming strings using defined patterns. It extracts unstructured event data into fields using delimiters. This process is called tokenization.
Unlike a regular split operation where one delimiter is applied to the whole string, the Dissect operation applies a set of delimiters to a string value.
All keys must be found and extracted for tokenization to be successful. If one or more keys cannot be found, an error occurs and the original event is not modified.
Dissect or Grok? Or both?
editDissect differs from Grok in that it does not use regular expressions and is faster. Dissect works well when data is reliably repeated. Grok is a better choice when the structure of your text varies from line to line.
You can use both Dissect and Grok for a hybrid use case when a section of the line is reliably repeated, but the entire line is not. The Dissect filter can deconstruct the section of the line that is repeated. The Grok filter can process the remaining field values with more regex predictability.
Terminology
editdissect pattern - the set of fields and delimiters describing the textual
format. Also known as a dissection.
The dissection is described using a set of %{}
sections:
%{a} - %{b} - %{c}
field - the text from %{
to }
inclusive.
delimiter - the text between }
and the next %{
characters.
Any set of characters other than %{
, 'not }'
, or }
is a delimiter.
key - the text between the %{
and }
, exclusive of the ?
, +
, &
prefixes
and the ordinal suffix.
Examples:
%{?aaa}
- the key is aaa
%{+bbb/3}
- the key is bbb
%{&ccc}
- the key is ccc
Using the .
(dot) as key
will generate fields with .
in the field name. If
you want to get nested fields, use the brackets notation such as %{[fieldname][subfieldname]}
.
Sample configuration
editThe config might look like this:
filter { dissect { mapping => { "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}" } } }
When a string is dissected from left to right, text is captured up to the first delimiter. The captured text is stored in the first field. This is repeated for each field/# delimiter pair until the last delimiter is reached. Then the remaining text is stored in the last field.
Notations
editNotes and usage guidelines
edit- For append or indirect fields, the key can refer to a field that already exists in the event before dissection.
-
Use a Skip field if you do not want the indirection key/value stored.
Example:
%{?a}: %{&a}
applied to textgoogle: 77.98
will build a key/value ofgoogle => 77.98
. -
Append and indirect cannot be combined.
Examples:
%{+&something}
will add a value to the&something
key (probably not the intended outcome).%{&+something}
will add a value to the+something
key (again probably unintended).
Normal field notation
editThe found value is added to the Event using the key. A normal field has no prefix or suffix.
Example:
%{some_field}
Skip field notation
editThe found value is stored internally, but is not added to the Event.
The key, if supplied, is prefixed with a ?
.
Examples:
%{}
is an empty skip field.
%{?foo}
is a named skip field.
Append field notation
editIf the value is the first field seen, it is stored. Subsequent fields are appended to another value.
The key is prefixed with a +
.
The final value is stored in the Event using the key.
The delimiter found before the field is appended with the value. If no delimiter is found before the field, a single space character is used.
Examples:
%{+some_field}
is an append field.
%{+some_field/2}
is an append field with an order modifier.
Order modifiers
An order modifier, /digits
, allows one to reorder the append sequence.
Example:
For text 1 2 3 go
, this %{+a/2} %{+a/1} %{+a/4} %{+a/3}
will build a key/value of a => 2 1 go 3
.
Append fields without an order modifier will append in declared order.
Example:
For text 1 2 3 go
, this %{a} %{b} %{+a}
will build two key/values of a => 1 3 go, b => 2
Indirect field notation
editThe found value is added to the Event using the found value of another field as the key.
The key is prefixed with a &
.
Examples:
%{&some_field}
is an indirect field where the key is indirectly sourced from the value of some_field
.
For text error: some_error, some_description
, this notation error: %{?err}, %{&err}
will build a key/value of some_error => some_description
.
Multiple Consecutive Delimiter Handling
editMultiple found delimiter handling has changed starting with version 1.1.1 of this plugin. Now multiple consecutive delimiters are seen as missing fields by default and not padding. If you are already using Dissect and your source text has fields padded with extra delimiters, you will need to change your config. Please read the section below.
Empty data between delimiters
editGiven this text as the sample used to create a dissection:
John Smith,Big Oaks,Wood Lane,Hambledown,Canterbury,CB34RY
The created dissection, with 6 fields, is:
%{name},%{addr1},%{addr2},%{addr3},%{city},%{zip}
When a line like this is processed:
Jane Doe,4321 Fifth Avenue,,,New York,87432
Dissect will create an event with empty fields for addr2 and addr3
like so:
{ "name": "Jane Doe", "addr1": "4321 Fifth Avenue", "addr2": "", "addr3": "", "city": "New York" "zip": "87432" }
Delimiters used as padding to visually align fields
editPadding to the right hand side
Given these texts as the samples used to create a dissection:
00000043 ViewReceive machine-321 f3000a3b Calc machine-123
The dissection, with 3 fields, is:
%{id} %{function->} %{server}
Note, above, the second field has a ->
suffix which tells Dissect to ignore padding to its right.
Dissect will create these events:
{ "id": "00000043", "function": "ViewReceive", "server": "machine-123" } { "id": "f3000a3b", "function": "Calc", "server": "machine-321" }
Always add the ->
suffix to the field on the left of the padding.
Padding to the left hand side (to the human eye)
Given these texts as the samples used to create a dissection:
00000043 ViewReceive machine-321 f3000a3b Calc machine-123
The dissection, with 3 fields, is now:
%{id->} %{function} %{server}
Here the ->
suffix moves to the id
field because Dissect sees the padding as being to the right of the id
field.
Conditional processing
editYou probably want to use this filter inside an if
block.
This ensures that the event contains a field value with a suitable structure for the dissection.
Example:
filter { if [type] == "syslog" or "syslog" in [tags] { dissect { mapping => { "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}" } } } }
Dissect Filter Configuration Options
editThis plugin supports the following configuration options plus the Common Options described later.
Setting | Input type | Required |
---|---|---|
No |
||
No |
||
No |
Also see Common Options for a list of options supported by all filter plugins.
convert_datatype
edit- Value type is hash
-
Default value is
{}
With this setting int
and float
datatype conversions can be specified.
These will be done after all mapping
dissections have taken place.
Feel free to use this setting on its own without a mapping
section.
Example
filter { dissect { convert_datatype => { "cpu" => "float" "code" => "int" } } }
mapping
edit- Value type is hash
-
Default value is
{}
A hash of dissections of field => value
Don’t use an escaped newline \n
in the value. It will be interpreted as two characters \
+ n
.
Instead use actual line breaks in the config.
Also use single quotes to define the value if it contains double quotes.
A later dissection can be done on values from a previous dissection or they can be independent.
Example
filter { dissect { mapping => { # using an actual line break "message" => '"%{field1}" "%{field2}" "%{description}"' "description" => "%{field3} %{field4} %{field5}" } } }
This is useful if you want to keep the field description
but also
dissect it further.
Common Options
editThe following configuration options are supported by all filter plugins:
Setting | Input type | Required |
---|---|---|
No |
||
No |
||
No |
||
No |
||
No |
||
No |
||
No |
add_field
edit- Value type is hash
-
Default value is
{}
If this filter is successful, add any arbitrary fields to this event.
Field names can be dynamic and include parts of the event using the %{field}
.
Example:
filter { dissect { add_field => { "foo_%{somefield}" => "Hello world, from %{host}" } } }
# You can also add multiple fields at once: filter { dissect { add_field => { "foo_%{somefield}" => "Hello world, from %{host}" "new_field" => "new_static_value" } } }
If the event has field "somefield" == "hello"
this filter, on success,
would add field foo_hello
if it is present, with the
value above and the %{host}
piece replaced with that value from the
event. The second example would also add a hardcoded field.
add_tag
edit- Value type is array
-
Default value is
[]
If this filter is successful, add arbitrary tags to the event.
Tags can be dynamic and include parts of the event using the %{field}
syntax.
Example:
filter { dissect { add_tag => [ "foo_%{somefield}" ] } }
# You can also add multiple tags at once: filter { dissect { add_tag => [ "foo_%{somefield}", "taggedy_tag"] } }
If the event has field "somefield" == "hello"
this filter, on success,
would add a tag foo_hello
(and the second example would of course add a taggedy_tag
tag).
enable_metric
edit- Value type is boolean
-
Default value is
true
Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
id
edit- Value type is string
- There is no default value for this setting.
Add a unique ID
to the plugin configuration. If no ID is specified, Logstash will generate one.
It is strongly recommended to set this ID in your configuration. This is particularly useful
when you have two or more plugins of the same type, for example, if you have 2 dissect filters.
Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.
filter { dissect { id => "ABC" } }
Variable substitution in the id
field only supports environment variables
and does not support the use of values from the secret store.
periodic_flush
edit- Value type is boolean
-
Default value is
false
Call the filter flush method at regular interval. Optional.
remove_field
edit- Value type is array
-
Default value is
[]
If this filter is successful, remove arbitrary fields from this event. Fields names can be dynamic and include parts of the event using the %{field} Example:
filter { dissect { remove_field => [ "foo_%{somefield}" ] } }
# You can also remove multiple fields at once: filter { dissect { remove_field => [ "foo_%{somefield}", "my_extraneous_field" ] } }
If the event has field "somefield" == "hello"
this filter, on success,
would remove the field with name foo_hello
if it is present. The second
example would remove an additional, non-dynamic field.
remove_tag
edit- Value type is array
-
Default value is
[]
If this filter is successful, remove arbitrary tags from the event.
Tags can be dynamic and include parts of the event using the %{field}
syntax.
Example:
filter { dissect { remove_tag => [ "foo_%{somefield}" ] } }
# You can also remove multiple tags at once: filter { dissect { remove_tag => [ "foo_%{somefield}", "sad_unwanted_tag"] } }
If the event has field "somefield" == "hello"
this filter, on success,
would remove the tag foo_hello
if it is present. The second example
would remove a sad, unwanted tag as well.