Character filters reference
Character filters are used to preprocess the stream of characters before it is passed to the tokenizer.
A character filter receives the original text as a stream of characters and
can transform the stream by adding, removing, or changing characters. For
instance, a character filter could be used to convert Hindu-Arabic numerals
(٠١٢٣٤٥٦٧٨٩) into their Arabic-Latin equivalents (0123456789), or to strip HTML
elements like <b> from the stream.
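The numeral conversion described above can be illustrated outside Elasticsearch with a minimal Python sketch. This is a conceptual illustration only, not how Elasticsearch implements character filters; the names DIGIT_TABLE and normalize_digits are hypothetical.

```python
# Conceptual sketch: a character filter maps one character stream to another
# before tokenization. This is NOT Elasticsearch code; names are hypothetical.

SOURCE_DIGITS = "٠١٢٣٤٥٦٧٨٩"   # the numerals from the example above
LATIN_DIGITS = "0123456789"

# Build a one-to-one translation table: each source digit maps to the
# Latin digit at the same position.
DIGIT_TABLE = str.maketrans(SOURCE_DIGITS, LATIN_DIGITS)


def normalize_digits(text: str) -> str:
    """Replace each source digit with its Latin equivalent; leave other characters alone."""
    return text.translate(DIGIT_TABLE)


if __name__ == "__main__":
    print(normalize_digits("Order ٢٠٢٤"))
```

The text is transformed character by character before any tokenization would take place, which is the role a character filter plays in an analyzer.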
Elasticsearch has a number of built-in character filters, which can be used to build custom analyzers.
- HTML Strip Character Filter
  The html_strip character filter strips out HTML elements like <b> and decodes HTML entities like &amp;.
- Mapping Character Filter
  The mapping character filter replaces any occurrences of the specified strings with the specified replacements.
- Pattern Replace Character Filter
  The pattern_replace character filter replaces any characters matching a regular expression with the specified replacement.
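One way to try the three filters listed above is to send them to the _analyze API as inline char_filter definitions. The following Python sketch only builds and prints the JSON request bodies; it does not contact a cluster. The helper build_analyze_body, the sample texts, and the choice of tokenizers are assumptions made for illustration.

```python
import json


def build_analyze_body(text, char_filters, tokenizer="keyword"):
    """Hypothetical helper: assemble a request body for POST /_analyze."""
    return {"tokenizer": tokenizer, "char_filter": char_filters, "text": text}


# html_strip: referenced by name, no extra configuration needed.
html_strip_body = build_analyze_body(
    "<p>I&apos;m so <b>happy</b>!</p>",
    ["html_strip"],
)

# mapping: each entry has the form "key => value".
digit_mappings = [f"{src} => {dst}" for src, dst in zip("٠١٢٣٤٥٦٧٨٩", "0123456789")]
mapping_body = build_analyze_body(
    "My license plate is ٢٥٠١٥",
    [{"type": "mapping", "mappings": digit_mappings}],
)

# pattern_replace: a Java regular expression plus a replacement string,
# where $1 refers to the first capture group.
pattern_body = build_analyze_body(
    "My credit card is 123-456-789",
    [{"type": "pattern_replace", "pattern": r"(\d+)-(?=\d)", "replacement": "$1_"}],
    tokenizer="standard",
)

if __name__ == "__main__":
    for body in (html_strip_body, mapping_body, pattern_body):
        print(json.dumps(body, ensure_ascii=False, indent=2))
```

Each printed body could then be sent to a running cluster with any HTTP client. The keyword tokenizer is used for the first two requests so that the filtered text comes back as a single token, which makes the effect of the character filter easy to see.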