Using the murmur3 field
The murmur3 field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "fields": {
            "hash": {
              "type": "murmur3"
            }
          }
        }
      }
    }
  }
}
With this mapping, my_field.hash refers to the hashes of the values of the
my_field field. This is only useful for running cardinality aggregations:
# Example documents
PUT my_index/_doc/1
{
  "my_field": "This is a document"
}

PUT my_index/_doc/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash"
      }
    }
  }
}
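To illustrate what the hash sub-field conceptually holds, here is a minimal Python sketch. It is not Elasticsearch's implementation: it assumes the 32-bit MurmurHash3 variant with seed 0 for brevity (the plugin's actual variant and hash width may differ), the third, repeated value is added only for illustration, and an exact set stands in for the approximate counting that a real cardinality aggregation performs. Each value is hashed once up front, and the distinct count is then taken over the hashes rather than over the original strings.

```python
def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit unsigned integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) of data, returned as an unsigned int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    length = len(data)
    body_end = length - (length % 4)

    # Mix the input four bytes at a time (little-endian blocks).
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Mix the remaining 1 to 3 bytes, if any.
    tail = data[body_end:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= length
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


# The two example documents, plus one repeated value (an assumption
# added here so that the distinct count differs from the total count).
values = [
    "This is a document",
    "This is another document",
    "This is a document",
]

# "Index time": hash every value once and keep the hash next to it.
hashes = [murmur3_32(v.encode("utf-8")) for v in values]

# "Query time": count distinct hashes instead of distinct strings.
distinct_hashes = len(set(hashes))
print(distinct_hashes)
```

Equal values always produce equal hashes, so counting distinct hashes gives the same answer as counting distinct values, barring hash collisions.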
Running a cardinality aggregation on the my_field field directly would
yield the same result; however, using my_field.hash instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
murmur3 field on numeric fields, or on string fields that are not almost
unique, is discouraged: it is unlikely to bring significant speed-ups, while
it increases the amount of disk space required to store the index.
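Because both fields are expected to return the same cardinality, the two approaches can be compared by issuing the same aggregation against each field. The following Python sketch only builds the two request bodies; `cardinality_request` is a hypothetical helper, `"size": 0` is an addition that skips returning search hits, and sending the bodies to `GET my_index/_search` is left to whichever client you use.

```python
import json


def cardinality_request(field: str) -> dict:
    """Build a search body that runs a cardinality aggregation on `field`.

    Hypothetical helper for illustration; the aggregation name matches
    the example above.
    """
    return {
        "size": 0,  # assumption: only the aggregation result is wanted
        "aggs": {
            "my_field_cardinality": {
                "cardinality": {"field": field}
            }
        },
    }


# Same aggregation, once on the original values and once on the hashes.
direct_body = cardinality_request("my_field")
hashed_body = cardinality_request("my_field.hash")

print(json.dumps(direct_body, indent=2))
print(json.dumps(hashed_body, indent=2))
```

Timing both requests against a representative index is one way to check whether the speed-up justifies the extra disk space.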