NOTE: You are looking at documentation for an older release. For the latest information, see the current release documentation.

Lowercase Tokenizer

The lowercase tokenizer, like the letter tokenizer, breaks text into terms whenever it encounters a character that is not a letter, but it also lowercases all terms. It is functionally equivalent to the letter tokenizer combined with the lowercase token filter, but is more efficient because it performs both steps in a single pass.

Example output

POST _analyze
{
  "tokenizer": "lowercase",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

The above sentence would produce the following terms:

[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
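The splitting and lowercasing behaviour described above can be approximated outside Elasticsearch with a short Python sketch. This is an illustration only, not the tokenizer's actual implementation: the function name lowercase_tokenize is hypothetical, and Python's str.isalpha() and str.lower() are assumed to be close enough to the tokenizer's notion of a letter and of lowercasing for plain English text. Results may differ for some Unicode characters and for very long tokens.

```python
def lowercase_tokenize(text: str) -> list[str]:
    """Approximate the lowercase tokenizer (hypothetical helper, not an Elasticsearch API).

    Splits on every non-letter character and lowercases each letter
    in the same pass over the input.
    """
    terms = []
    current = []  # letters of the term currently being built
    for ch in text:
        if ch.isalpha():
            # Letter: lowercase it immediately and extend the current term.
            current.append(ch.lower())
        elif current:
            # Non-letter: close the current term, if there is one.
            terms.append("".join(current))
            current = []
    if current:
        # Flush a term that runs to the end of the input.
        terms.append("".join(current))
    return terms


print(lowercase_tokenize("The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."))
# ['the', 'quick', 'brown', 'foxes', 'jumped', 'over', 'the', 'lazy', 'dog', 's', 'bone']
```

Note that the digit 2 is dropped entirely, and that the hyphen and the apostrophe act as separators, which is why Brown-Foxes and dog's each yield two terms.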

Configuration

The lowercase tokenizer is not configurable.
