IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
phonetic token filter
The phonetic token filter takes the following settings:
- encoder: Which phonetic encoder to use. Accepts metaphone (default), double_metaphone, soundex, refined_soundex, caverphone1, caverphone2, cologne, nysiis, koelnerphonetik, haasephonetik, beider_morse, daitch_mokotoff.
- replace: Whether or not the original token should be replaced by the phonetic token. Accepts true (default) and false. Not supported by beider_morse encoding.
PUT phonetic_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_metaphone"
            ]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": false
          }
        }
      }
    }
  }
}

GET phonetic_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Joe Bloggs"
}
Double metaphone settings
If the double_metaphone encoder is used, then this additional setting is supported:
- max_code_len: The maximum length of the emitted metaphone token. Defaults to 4.
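As an illustration of the setting above, the following request sketches a filter that uses the double_metaphone encoder with a longer code length. The index name phonetic_dm_sample, the analyzer name my_dm_analyzer, the filter name my_double_metaphone, and the value 6 are hypothetical choices made for this example, not values taken from the reference.

```console
# Hypothetical index, analyzer, and filter names; 6 is an arbitrary example value
PUT phonetic_dm_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_dm_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_double_metaphone"
            ]
          }
        },
        "filter": {
          "my_double_metaphone": {
            "type": "phonetic",
            "encoder": "double_metaphone",
            "max_code_len": 6
          }
        }
      }
    }
  }
}
```

Raising max_code_len should allow longer phonetic codes than the default of 4, which may reduce collisions between longer names at the cost of looser matching on their endings.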
Beider Morse settings
If the beider_morse encoder is used, then these additional settings are supported:
- rule_type: Whether matching should be exact or approx (default).
- name_type: Whether names are ashkenazi, sephardic, or generic (default).
- languageset: An array of languages to check. If not specified, then the language will be guessed. Accepts: any, common, cyrillic, english, french, german, hebrew, hungarian, polish, romanian, russian, spanish.
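The Beider Morse settings above could be combined as in the following sketch. The index name phonetic_bm_sample, the analyzer name my_bm_analyzer, and the filter name my_beider_morse are hypothetical, and the chosen values (exact matching, generic names, English and German) are only one possible combination. The replace setting is left out because it is not supported by beider_morse encoding.

```console
# Hypothetical index, analyzer, and filter names; setting values are examples only
PUT phonetic_bm_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_bm_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_beider_morse"
            ]
          }
        },
        "filter": {
          "my_beider_morse": {
            "type": "phonetic",
            "encoder": "beider_morse",
            "rule_type": "exact",
            "name_type": "generic",
            "languageset": [
              "english",
              "german"
            ]
          }
        }
      }
    }
  }
}
```

Omitting languageset leaves the language to be guessed, as described above; listing languages explicitly is a way to restrict which rule sets are checked.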