Extracting Fields and Wrangling Data
The plugins described in this section are useful for extracting fields and parsing unstructured data into fields.
- dissect filter
Extracts unstructured event data into fields by using delimiters. The dissect filter does not use regular expressions and is very fast. However, if the structure of the data varies from line to line, the grok filter is more suitable.
For example, let’s say you have a log that contains the following message:
Apr 26 12:20:02 localhost systemd[1]: Starting system activity accounting tool...
The following config dissects the message:
filter {
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts} %{+ts} %{src} %{prog}[%{pid}]: %{msg}"
    }
  }
}
After the dissect filter is applied, the event will be dissected into the following fields:
{
  "msg" => "Starting system activity accounting tool...",
  "@timestamp" => 2017-04-26T19:33:39.257Z,
  "src" => "localhost",
  "@version" => "1",
  "host" => "localhost.localdomain",
  "pid" => "1",
  "message" => "Apr 26 12:20:02 localhost systemd[1]: Starting system activity accounting tool...",
  "type" => "stdin",
  "prog" => "systemd",
  "ts" => "Apr 26 12:20:02"
}
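Every field that dissect extracts is a string, so `pid` in the output above holds "1" rather than a number. As a hedged sketch, the filter's `convert_datatype` setting can cast extracted fields to `int` or `float`. Converting `pid` is an illustrative choice here and not part of the original example:

```
filter {
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts} %{+ts} %{src} %{prog}[%{pid}]: %{msg}"
    }
    # Illustrative addition: store pid as an integer instead of a string.
    convert_datatype => {
      "pid" => "int"
    }
  }
}
```

With this setting, the event should carry `"pid" => 1` while the other fields stay unchanged.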
- kv filter
Parses key-value pairs.
For example, let’s say you have a log message that contains the following key-value pairs:
ip=1.2.3.4 error=REFUSED
The following config parses the key-value pairs into fields:
filter { kv { } }
After the filter is applied, the event in the example will have these fields:
  - ip: 1.2.3.4
  - error: REFUSED
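With no options set, the kv filter reads the `message` field, separates pairs on spaces, and separates each key from its value on `=`. The sketch below overrides those defaults for a hypothetical input such as `ip=1.2.3.4&error=REFUSED`, where the pairs are joined by `&`. The input string is an assumption made for illustration:

```
filter {
  kv {
    source      => "message"   # field to parse (this is the default)
    field_split => "&"         # characters that separate one pair from the next
    value_split => "="         # characters that separate a key from its value (this is the default)
  }
}
```

The resulting event would have the same `ip` and `error` fields as the example above.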
- grok filter
Parses unstructured event data into fields. This filter is well suited to syslog logs, Apache and other web server logs, MySQL logs, and, in general, any log format written for humans rather than for machine consumption. Grok works by combining named text patterns, each backed by a regular expression, into an expression that matches your log lines.
For example, let’s say you have an HTTP request log that contains the following message:
55.3.244.1 GET /index.html 15824 0.043
The following config parses the message into fields:
filter {
  grok {
    match => {
      "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
    }
  }
}
After the filter is applied, the event in the example will have these fields:
  - client: 55.3.244.1
  - method: GET
  - request: /index.html
  - bytes: 15824
  - duration: 0.043
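Grok stores captures as strings by default, so `bytes` and `duration` in the example above are text rather than numbers. As a sketch, appending `:int` or `:float` to a capture asks grok to convert the value. Which fields to convert is an illustrative choice and not part of the original example:

```
filter {
  # Same pattern as above, with numeric conversions on the last two captures.
  grok {
    match => {
      "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}"
    }
  }
}
```

Converting at parse time means downstream range queries and aggregations can treat these fields as numbers without a separate conversion step.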