Fingerprint filter plugin
- Plugin version: v3.2.4
- Released on: 2021-05-14
- Changelog
For other versions, see the Versioned plugin docs.
Getting Help
For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.
Description
Create consistent hashes (fingerprints) of one or more fields and store the result in a new field.
You can use this plugin to create consistent document ids when events are inserted into Elasticsearch. This approach means that existing documents can be updated instead of creating new documents.
When the method option is set to UUID, the result won’t be a consistent hash but a random UUID.
To generate UUIDs, prefer the uuid filter.
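As a minimal sketch of the document-id use case described above, the pipeline below fingerprints one field and hands the result to the Elasticsearch output as the document id. The source field message, the hosts address, and the choice to store the hash under [@metadata] are illustrative assumptions, not something this page prescribes.

```logstash
filter {
  fingerprint {
    source => ["message"]                  # assumed field to hash
    target => "[@metadata][fingerprint]"   # assumed location; keeps the hash out of the stored document
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts       => ["http://localhost:9200"]       # placeholder address
    document_id => "%{[@metadata][fingerprint]}"   # identical content maps to the same id
  }
}
```

With a setup like this, re-ingesting an event with the same message updates the existing document rather than creating a new one.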
Fingerprint Filter Configuration Options
This plugin supports the following configuration options plus the Common Options described later.
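Setting | Input type | Required |
---|---|---|
base64encode | boolean | No |
concatenate_sources | boolean | No |
concatenate_all_fields | boolean | No |
key | string | No |
method | string, one of ["SHA1", "SHA256", "SHA384", "SHA512", "MD5", "MURMUR3", "IPV4_NETWORK", "UUID", "PUNCTUATION"] | Yes |
source | array | No |
target | string | No |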
Also see Common Options for a list of options supported by all filter plugins.
base64encode
- Value type is boolean
- Default value is false
When set to true, the SHA1, SHA256, SHA384, SHA512, and MD5 fingerprint methods produce base64-encoded rather than hex-encoded strings.
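A minimal sketch of a configuration that switches the output encoding, assuming the event has a message field to hash:

```logstash
filter {
  fingerprint {
    source       => ["message"]   # assumed source field
    method       => "SHA256"
    base64encode => true          # emit base64 instead of hex
  }
}
```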
concatenate_sources
- Value type is boolean
- Default value is false
When set to true and method isn’t UUID or PUNCTUATION, the plugin concatenates the names and values of all fields given in the source option into one string (like the old checksum filter) before doing the fingerprint computation.
If false and multiple source fields are given, the target field will be a single fingerprint of the last source field.
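For the true case, a minimal sketch might look like the following. It assumes the same hypothetical user_id, siblings, and birthday fields, and yields one fingerprint computed over all three rather than only the last one:

```logstash
filter {
  fingerprint {
    source              => ["user_id", "siblings", "birthday"]
    concatenate_sources => true     # hash the names and values of all listed fields together
    method              => "SHA1"
  }
}
```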
Example: concatenate_sources=false
This example produces a single fingerprint that is computed from "birthday", the last source field.
fingerprint { source => ["user_id", "siblings", "birthday"] }
The output is:
"fingerprint" => "6b6390a4416131f82b6ffb509f6e779e5dd9630f"
Example: concatenate_sources=false with array
If the last source field is an array, you get an array of fingerprints.
In this example, "siblings" is an array ["big brother", "little sister", "little brother"].
fingerprint { source => ["user_id", "siblings"] }
The output is:
"fingerprint" => [
  [0] "8a8a9323677f4095fcf0c8c30b091a0133b00641",
  [1] "2ce11b313402e0e9884e094409f8d9fcf01337c2",
  [2] "adc0b90f9391a82098c7b99e66a816e9619ad0a7"
],
concatenate_all_fields
- Value type is boolean
- Default value is false
When set to true and method isn’t UUID or PUNCTUATION, the plugin concatenates the names and values of all fields of the event into one string (like the old checksum filter) before doing the fingerprint computation.
If false and at least one source field is given, the target field will be an array with fingerprints of the source fields given.
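A minimal sketch of hashing the whole event; the target field name event_hash is a hypothetical choice for illustration:

```logstash
filter {
  fingerprint {
    concatenate_all_fields => true           # hash every field of the event, not just listed sources
    method                 => "SHA256"
    target                 => "event_hash"   # hypothetical target field name
  }
}
```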
key
- Value type is string
- There is no default value for this setting.
When used with the IPV4_NETWORK method, set this option to the subnet prefix length.
With other methods, optionally set it to the HMAC key.
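A minimal sketch of a keyed (HMAC) fingerprint. The environment variable name FINGERPRINT_KEY and the message source field are assumptions made for this example; the ${...} form relies on Logstash's environment variable substitution in configuration files.

```logstash
filter {
  fingerprint {
    source => ["message"]             # assumed source field
    method => "SHA256"
    key    => "${FINGERPRINT_KEY}"    # hypothetical environment variable holding the HMAC key
  }
}
```

Reading the key from the environment keeps it out of the pipeline file itself.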
method
- This is a required setting.
- Value can be any of: SHA1, SHA256, SHA384, SHA512, MD5, MURMUR3, IPV4_NETWORK, UUID, PUNCTUATION
- Default value is "SHA1"
The fingerprint method to use.
If set to SHA1, SHA256, SHA384, SHA512, or MD5, the cryptographic hash function with the same name will be used to generate the fingerprint. When a key is set, the keyed-hash (HMAC) digest function will be used instead.
If set to MURMUR3, the non-cryptographic 64 bit MurmurHash function will be used.
If set to IPV4_NETWORK, the input data needs to be an IPv4 address, and the hash value will be the masked-out address using the number of bits specified in the key option. For example, with "1.2.3.4" as the input and key set to 16, the hash becomes "1.2.0.0".
If set to PUNCTUATION, all non-punctuation characters will be removed from the input string.
If set to UUID, a UUID will be generated. The result will be random and thus not a consistent hash.
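The IPV4_NETWORK case above can be sketched as follows. The field names clientip and client_network are hypothetical, and the key is quoted because the option's value type is string.

```logstash
filter {
  fingerprint {
    source => ["clientip"]         # hypothetical field holding an IPv4 address such as "1.2.3.4"
    method => "IPV4_NETWORK"
    key    => "16"                 # prefix length; "1.2.3.4" becomes "1.2.0.0"
    target => "client_network"     # hypothetical target field
  }
}
```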
Common Options
The following configuration options are supported by all filter plugins:
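Setting | Input type | Required |
---|---|---|
add_field | hash | No |
add_tag | array | No |
enable_metric | boolean | No |
id | string | No |
periodic_flush | boolean | No |
remove_field | array | No |
remove_tag | array | No |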
add_field
- Value type is hash
- Default value is {}
If this filter is successful, add any arbitrary fields to this event.
Field names can be dynamic and include parts of the event using the %{field} syntax.
Example:
filter {
  fingerprint {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
# You can also add multiple fields at once:
filter {
  fingerprint {
    add_field => {
      "foo_%{somefield}" => "Hello world, from %{host}"
      "new_field" => "new_static_value"
    }
  }
}
If the event has field "somefield" == "hello", this filter, on success, would add field foo_hello, with the value above and the %{host} piece replaced with that value from the event. The second example would also add a hardcoded field.
add_tag
- Value type is array
- Default value is []
If this filter is successful, add arbitrary tags to the event.
Tags can be dynamic and include parts of the event using the %{field} syntax.
Example:
filter {
  fingerprint {
    add_tag => [ "foo_%{somefield}" ]
  }
}
# You can also add multiple tags at once:
filter {
  fingerprint {
    add_tag => [ "foo_%{somefield}", "taggedy_tag"]
  }
}
If the event has field "somefield" == "hello", this filter, on success, would add a tag foo_hello (and the second example would of course add a taggedy_tag tag).
enable_metric
- Value type is boolean
- Default value is true
Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
id
- Value type is string
- There is no default value for this setting.
Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one.
It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, if you have 2 fingerprint filters.
Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.
filter { fingerprint { id => "ABC" } }
Variable substitution in the id field only supports environment variables and does not support the use of values from the secret store.
periodic_flush
- Value type is boolean
- Default value is false
Call the filter flush method at a regular interval. Optional.
remove_field
- Value type is array
- Default value is []
If this filter is successful, remove arbitrary fields from this event.
Field names can be dynamic and include parts of the event using the %{field} syntax.
Example:
filter {
  fingerprint {
    remove_field => [ "foo_%{somefield}" ]
  }
}
# You can also remove multiple fields at once:
filter {
  fingerprint {
    remove_field => [ "foo_%{somefield}", "my_extraneous_field" ]
  }
}
If the event has field "somefield" == "hello", this filter, on success, would remove the field with name foo_hello if it is present. The second example would remove an additional, non-dynamic field.
remove_tag
- Value type is array
- Default value is []
If this filter is successful, remove arbitrary tags from the event.
Tags can be dynamic and include parts of the event using the %{field} syntax.
Example:
filter {
  fingerprint {
    remove_tag => [ "foo_%{somefield}" ]
  }
}
# You can also remove multiple tags at once:
filter {
  fingerprint {
    remove_tag => [ "foo_%{somefield}", "sad_unwanted_tag"]
  }
}
If the event has field "somefield" == "hello", this filter, on success, would remove the tag foo_hello if it is present. The second example would remove a sad, unwanted tag as well.