Analyzers
Elasticsearch ships with a wide range of built-in analyzers, which can be used in any index without further configuration (see the example after this list):
- Standard Analyzer
  The standard analyzer divides text into terms on word boundaries, as defined by the Unicode Text Segmentation algorithm. It removes most punctuation, lowercases terms, and supports removing stop words.
- Simple Analyzer
  The simple analyzer divides text into terms whenever it encounters a character which is not a letter. It lowercases all terms.
- Whitespace Analyzer
  The whitespace analyzer divides text into terms whenever it encounters any whitespace character. It does not lowercase terms.
- Stop Analyzer
  The stop analyzer is like the simple analyzer, but also supports removal of stop words.
- Keyword Analyzer
  The keyword analyzer is a “noop” analyzer that accepts whatever text it is given and outputs the exact same text as a single term.
- Pattern Analyzer
  The pattern analyzer uses a regular expression to split the text into terms. It supports lower-casing and stop words.
- Language Analyzers
  Elasticsearch provides many language-specific analyzers like english or french.
- Fingerprint Analyzer
  The fingerprint analyzer is a specialist analyzer which creates a fingerprint which can be used for duplicate detection.
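
For example, any of these built-in analyzers can be tried out with the _analyze API. The sample text below is purely illustrative:

POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

The response lists the terms that the chosen analyzer produces for the given text, which is a convenient way to compare how the different analyzers behave before picking one for a field.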
Custom analyzers
If you do not find an analyzer suitable for your needs, you can create a custom analyzer which combines the appropriate character filters, tokenizer, and token filters.
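
As a sketch of what this looks like, the index settings below define a custom analyzer; the index name my_index and analyzer name my_custom_analyzer are placeholders, and the particular character filter, tokenizer, and token filters shown are just one possible combination:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": [ "html_strip" ],
          "tokenizer": "standard",
          "filter": [ "lowercase", "asciifolding" ]
        }
      }
    }
  }
}

Fields in the index mappings can then reference my_custom_analyzer by name in their analyzer setting.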