IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
Uppercase token filter
Changes token text to uppercase. For example, you can use the uppercase filter to change "the Lazy DoG" to "THE LAZY DOG".
This filter uses Lucene’s UpperCaseFilter.
Depending on the language, an uppercase character can map to multiple lowercase characters. Using the uppercase filter could result in the loss of lowercase character information.
To avoid this loss but still have a consistent letter case, use the lowercase filter instead.
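The many-to-one mapping described above can be seen with the two Greek lowercase sigmas, which share a single uppercase form. The following is a minimal sketch using Ruby's own String#upcase and String#downcase, not Lucene's UpperCaseFilter, so it illustrates the general problem rather than the filter's exact behavior. The helper names same_uppercase? and survives_round_trip? are hypothetical.

```ruby
# Illustration only: uses Ruby's built-in case mapping, not Lucene's filter.

# True when two strings become identical after uppercasing.
def same_uppercase?(first, second)
  first.upcase == second.upcase
end

# True when uppercasing and then lowercasing returns the original text.
def survives_round_trip?(text)
  text.upcase.downcase == text
end

medial_sigma = "σ" # U+03C3, used inside a word
final_sigma  = "ς" # U+03C2, used at the end of a word

# Both lowercase sigmas map to the same uppercase character.
puts same_uppercase?(medial_sigma, final_sigma)

# Once uppercased, the final sigma cannot be recovered by lowercasing.
puts survives_round_trip?(medial_sigma)
puts survives_round_trip?(final_sigma)
```

After uppercasing, the two characters are indistinguishable, which is the kind of lost lowercase information the note above refers to.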
Example
The following analyze API request uses the default uppercase filter to change "the Quick FoX JUMPs" to uppercase:
response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      'uppercase'
    ],
    text: 'the Quick FoX JUMPs'
  }
)
puts response

GET _analyze
{
  "tokenizer" : "standard",
  "filter" : ["uppercase"],
  "text" : "the Quick FoX JUMPs"
}
The filter produces the following tokens:
[ THE, QUICK, FOX, JUMPS ]
Add to an analyzer
The following create index API request uses the uppercase filter to configure a new custom analyzer.
response = client.indices.create(
  index: 'uppercase_example',
  body: {
    settings: {
      analysis: {
        analyzer: {
          whitespace_uppercase: {
            tokenizer: 'whitespace',
            filter: [
              'uppercase'
            ]
          }
        }
      }
    }
  }
)
puts response

PUT uppercase_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_uppercase": {
          "tokenizer": "whitespace",
          "filter": [ "uppercase" ]
        }
      }
    }
  }
}
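Once the index exists, one way to check the custom analyzer is to send sample text through the analyze API with the index and analyzer named explicitly. The sketch below assumes the uppercase_example index from the request above and a client that responds to indices.analyze like the Elasticsearch Ruby client. The helper name uppercase_tokens is hypothetical and not part of the client.

```ruby
# Hypothetical helper: runs text through the whitespace_uppercase analyzer
# defined on the uppercase_example index and returns the token strings.
# `client` is assumed to behave like the Elasticsearch Ruby client.
def uppercase_tokens(client, text)
  response = client.indices.analyze(
    index: 'uppercase_example',
    body: {
      analyzer: 'whitespace_uppercase',
      text: text
    }
  )
  # The analyze API returns a "tokens" array; each entry has a "token" field.
  response['tokens'].map { |entry| entry['token'] }
end
```

With a connected client, calling uppercase_tokens(client, 'the Quick FoX JUMPs') should return the whitespace-separated words in uppercase.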