Dissect strings
The dissect processor tokenizes incoming strings using defined patterns.
Example
- dissect:
    tokenizer: "%{key1} %{key2} %{key3|convert_datatype}"
    field: "message"
    target_prefix: "dissect"
For a full example, see Dissect example.
Configuration settings
Elastic Agent processors execute before ingest pipelines, which means that your processor configurations cannot refer to fields that are created by ingest pipelines or Logstash. For more limitations, refer to What are some limitations of using processors?
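Name | Required | Default | Description |
---|---|---|---|
tokenizer | Yes | | Field used to define the dissection pattern. You can provide an optional convert datatype after the key by using a pipe character (\|) as a separator, for example %{service.pid\|integer}. |
field | No | message | Event field to tokenize. |
target_prefix | No | dissect | Name of the field where the values will be extracted. When an empty string is defined, the processor creates the keys at the root of the event. When the target key already exists in the event, the processor won't replace it and logs an error; you need to either drop or rename the key before using dissect, or enable the overwrite_keys flag. |
ignore_failure | No | false | Controls whether the processor returns an error if the tokenizer fails to match the message field. If true, the processor silently restores the original event, allowing execution of subsequent processors (if any). If false, the processor logs an error, preventing execution of other processors. |
overwrite_keys | No | false | Whether to overwrite existing keys. If true, the processor overwrites existing keys in the event. If false, the processor fails if the key already exists. |
trim_values | No | none | Enables the trimming of the extracted values. Useful to remove leading and trailing spaces. Possible values are: none (no trimming), left (trim on the left), right (trim on the right), and all (trim on both sides). |
trim_chars | No | " " (space character) | Set of characters to trim from values when trim_values is enabled. |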
For tokenization to be successful, all keys must be found and extracted. If a key cannot be found, an error is logged, and no modification is done on the original event.
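As a rough illustration of how the optional settings combine, the following sketch reuses the pattern from the example at the top of this page with a concrete datatype. The option names ignore_failure, overwrite_keys, trim_values, and trim_chars are assumed to match the Beats dissect processor; the values shown are illustrative, not recommendations.

```yaml
- dissect:
    # Pattern from the example at the top of this page, with "integer"
    # substituted for the convert_datatype placeholder.
    tokenizer: "%{key1} %{key2} %{key3|integer}"
    field: "message"
    target_prefix: "dissect"
    # Assumed option names (as in the Beats dissect processor):
    ignore_failure: true   # keep running later processors if the pattern does not match
    overwrite_keys: true   # replace keys that already exist under target_prefix
    trim_values: "all"     # trim both the left and right side of each value
    trim_chars: " \t"      # characters to trim: space and tab
```

With ignore_failure left at its default, a message that does not match the pattern is logged as an error and the event is left unmodified, as described above.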
A key can contain any characters except reserved suffix or prefix modifiers: /, &, +, #, and ?.
See Conditions for a list of supported conditions.
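A minimal sketch of a conditional dissect, assuming the usual processor syntax in which a when block sits alongside the processor's own settings. The contains condition and the " - " separator string are illustrative choices, not taken from this page.

```yaml
- dissect:
    # Assumption: only tokenize events whose message contains the
    # " - " separator that the pattern relies on.
    when:
      contains:
        message: " - "
    tokenizer: "%{key1} - %{key2} - %{key3}"
    field: "message"
    target_prefix: "dissect"
```

Guarding the processor this way avoids logging a tokenizer error for events that were never expected to match the pattern.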
Dissect example
For this example, imagine that an application generates the following messages:
"321 - App01 - WebServer is starting"
"321 - App01 - WebServer is up and running"
"321 - App01 - WebServer is scaling 2 pods"
"789 - App02 - Database will be restarted in 5 minutes"
"789 - App02 - Database is up and running"
"789 - App02 - Database is refreshing tables"
Use the dissect processor to split each message into three fields, for example, service.pid, service.name, and service.status:
- dissect:
    tokenizer: '"%{service.pid|integer} - %{service.name} - %{service.status}"'
    field: "message"
    target_prefix: ""
This configuration produces fields like:
"service": {
  "pid": 321,
  "name": "App01",
  "status": "WebServer is up and running"
},
service.name is an ECS keyword field, which means that you can use it in Elasticsearch for filtering, sorting, and aggregations.
When possible, use ECS-compatible field names. For more information, see the Elastic Common Schema documentation.
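As a variation on the example above, the following sketch keeps the extracted values under a prefix instead of writing them to the root of the event. The prefix name parsed is hypothetical and chosen only for illustration.

```yaml
- dissect:
    tokenizer: '"%{service.pid|integer} - %{service.name} - %{service.status}"'
    field: "message"
    # Hypothetical prefix: extracted keys are expected to land under
    # parsed.service.* rather than at the root of the event.
    target_prefix: "parsed"
```

A non-empty prefix reduces the chance of colliding with keys that already exist in the event, at the cost of the fields no longer matching ECS names such as service.name directly.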