Unique token filter
Removes duplicate tokens from a stream. For example, you can use the unique filter to change "the lazy lazy dog" to "the lazy dog".
If the only_on_same_position parameter is set to true, the unique filter removes only duplicate tokens in the same position.
When only_on_same_position is true, the unique filter works the same as the remove_duplicates filter.
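The two modes described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the filter's actual implementation: the names `unique` and `whitespace_tokens` and the `(term, position)` tuple representation are assumptions made for this example, and the real filter's handling of position increments may differ.

```python
from typing import Iterable, List, Tuple

# Hypothetical token model for illustration: (term, position).
Token = Tuple[str, int]


def whitespace_tokens(text: str) -> List[Token]:
    """Split on whitespace and give each term its own position."""
    return [(term, position) for position, term in enumerate(text.split())]


def unique(tokens: Iterable[Token], only_on_same_position: bool = False) -> List[Token]:
    """Keep the first occurrence of each token and drop later duplicates.

    With only_on_same_position=False, a term is a duplicate if it has been
    seen anywhere earlier in the stream. With only_on_same_position=True,
    it is a duplicate only if the same term was already seen at the same
    position.
    """
    seen = set()
    kept = []
    for term, position in tokens:
        key = (term, position) if only_on_same_position else term
        if key in seen:
            continue
        seen.add(key)
        kept.append((term, position))
    return kept


stream = whitespace_tokens("the lazy lazy dog")

# Default mode: the second "lazy" is removed.
print([term for term, _ in unique(stream)])

# Same-position mode: every token has a distinct position, so nothing is removed.
print([term for term, _ in unique(stream, only_on_same_position=True)])

# Tokens stacked on one position: the repeated "quick" at position 0 is removed.
stacked = [("quick", 0), ("quick", 0), ("fox", 1)]
print(unique(stacked, only_on_same_position=True))
```

In this model, same-position mode only has an effect when an earlier step in the analysis chain emits more than one token at a single position.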
Example
The following analyze API request uses the unique filter to remove duplicate tokens from "the quick fox jumps the lazy fox":
GET _analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["unique"],
  "text" : "the quick fox jumps the lazy fox"
}
The filter removes the duplicated tokens "the" and "fox", producing the following output:
[ the, quick, fox, jumps, lazy ]
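For readers who script against a cluster, the same request could be sent from Python along the following lines. This is a minimal sketch using only the standard library. The base URL `http://localhost:9200`, the absence of authentication, and the helper names `build_analyze_request` and `analyze_terms` are assumptions for illustration. The sketch uses POST, which the analyze API accepts as well as GET.

```python
import json
import urllib.request


def build_analyze_request(text, base_url="http://localhost:9200"):
    """Build an analyze API request that applies the unique filter.

    base_url is an assumed address of an unsecured local cluster.
    """
    body = {
        "tokenizer": "whitespace",
        "filter": ["unique"],
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_terms(text, base_url="http://localhost:9200"):
    """Send the request and return only the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(text, base_url)) as response:
        payload = json.load(response)
    return [token["token"] for token in payload["tokens"]]


# Usage, with a cluster reachable at the assumed address:
#     analyze_terms("the quick fox jumps the lazy fox")
```

Nothing is sent until `analyze_terms` is called, so the request body can be inspected or adapted (for example, to add credentials) before use.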
Add to an analyzer
The following create index API request uses the unique filter to configure a new custom analyzer.
PUT custom_unique_example
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "standard_truncate" : {
          "tokenizer" : "standard",
          "filter" : ["unique"]
        }
      }
    }
  }
}
Configurable parameters
only_on_same_position
    (Optional, Boolean) If true, only remove duplicate tokens in the same position. Defaults to false.
Customize
To customize the unique filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.
For example, the following request creates a custom unique filter with only_on_same_position set to true.
PUT letter_unique_pos_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "letter_unique_pos": {
          "tokenizer": "letter",
          "filter": [ "unique_pos" ]
        }
      },
      "filter": {
        "unique_pos": {
          "type": "unique",
          "only_on_same_position": true
        }
      }
    }
  }
}