WARNING: Version 2.0 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Classic Tokenizer
A tokenizer of type `classic` is a grammar-based tokenizer that
works well for English-language documents. It has
heuristics for the special treatment of acronyms, company names, email addresses,
and internet host names. However, these rules don't always work, and
the tokenizer doesn't work well for most languages other than English.
The following setting can be configured for a tokenizer of type `classic`:
| Setting | Description |
|---|---|
| `max_token_length` | The maximum token length. A token that exceeds this length is discarded. Defaults to `255`. |
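As an illustration of how this setting might be applied, the following minimal sketch builds an index-settings body that defines a custom tokenizer of type `classic` and an analyzer that uses it. It assumes the setting is named `max_token_length`; the names `my_classic` and `my_classic_analyzer`, the helper function, and the value `100` are hypothetical and chosen only for the example. The sketch only constructs and prints the JSON body and does not contact a cluster.

```python
import json


def classic_tokenizer_settings(max_token_length):
    """Build an index-settings body with a custom classic tokenizer.

    'my_classic' and 'my_classic_analyzer' are hypothetical names;
    'max_token_length' is assumed to be the name of the length setting.
    """
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    "my_classic": {
                        "type": "classic",
                        "max_token_length": max_token_length,
                    }
                },
                "analyzer": {
                    # A custom analyzer that applies only the tokenizer above.
                    "my_classic_analyzer": {
                        "type": "custom",
                        "tokenizer": "my_classic",
                    }
                },
            }
        }
    }


# Example value only; pick a limit that suits your data.
body = classic_tokenizer_settings(max_token_length=100)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the request body when creating an index, after which the analyzer would be available to field mappings under the chosen name.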