Lowercase tokenizer
The lowercase tokenizer, like the letter tokenizer, breaks text into terms whenever it encounters a character that is not a letter, but it also lowercases all terms. It is functionally equivalent to the letter tokenizer combined with the lowercase token filter, but is more efficient because it performs both steps in a single pass.
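The single-pass behavior described above can be approximated locally with a short Python sketch. This is only an illustration, not the Elasticsearch implementation: the function name lowercase_tokenize is hypothetical, and Python's str.isalpha() is assumed to be a close enough stand-in for the tokenizer's notion of a letter.

```python
def lowercase_tokenize(text: str) -> list[str]:
    """Split on non-letters and lowercase in one pass (approximation only)."""
    terms: list[str] = []
    current: list[str] = []
    for ch in text:
        if ch.isalpha():
            # Letter: lowercase it immediately and extend the current term.
            current.append(ch.lower())
        elif current:
            # Non-letter: close the term in progress, if there is one.
            terms.append("".join(current))
            current = []
    if current:
        # Flush the final term when the text ends with a letter.
        terms.append("".join(current))
    return terms


print(lowercase_tokenize("Wi-Fi 6E ROUTER's"))
# ['wi', 'fi', 'e', 'router', 's']
```

Digits and punctuation never appear in a term; they only act as boundaries, which is why the "6" is dropped and the apostrophe splits "ROUTER's" into two terms.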
Example output
resp = client.indices.analyze(
    tokenizer="lowercase",
    text="The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.",
)
print(resp)

response = client.indices.analyze(
  body: {
    tokenizer: 'lowercase',
    text: "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
  }
)
puts response

const response = await client.indices.analyze({
  tokenizer: "lowercase",
  text: "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.",
});
console.log(response);

POST _analyze
{
  "tokenizer": "lowercase",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
The above sentence would produce the following terms:
[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
Configuration
The lowercase tokenizer is not configurable.
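Because there are no parameters to set, a custom analyzer would simply refer to the tokenizer by name. The following is a minimal sketch of such analysis settings as a Python dictionary; the analyzer name my_lowercase_analyzer and the index name in the commented-out call are hypothetical, and the call itself assumes an already configured client object.

```python
# Analysis settings that reference the built-in tokenizer by name only.
# "my_lowercase_analyzer" is a hypothetical analyzer name for illustration.
settings = {
    "analysis": {
        "analyzer": {
            "my_lowercase_analyzer": {
                "type": "custom",
                "tokenizer": "lowercase",  # no tokenizer options to pass
            }
        }
    }
}

# Assumed usage with an existing client and a hypothetical index name:
# client.indices.create(index="my-index-000001", settings=settings)
```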