IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
CJK Width Token Filter
The cjk_width token filter normalizes CJK width differences:
- Folds fullwidth ASCII variants into the equivalent basic Latin
- Folds halfwidth Katakana variants into the equivalent Kana
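As an illustration of how the filter might be wired into an analysis chain, the sketch below builds an index settings body as a Python dictionary and prints it as JSON. The analyzer name `width_folded` and the choice of the `standard` tokenizer are assumptions made for this example; only the filter name `cjk_width` comes from the text above.

```python
import json

# Hypothetical index settings: a custom analyzer that applies cjk_width.
# "width_folded" is an illustrative name, and the "standard" tokenizer is
# an assumption; adjust both to fit the actual index.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "width_folded": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["cjk_width"],
                }
            }
        }
    }
}

# Serialize to JSON so it can be sent as the body of an index creation request.
body = json.dumps(index_settings, indent=2)
print(body)
```

The printed JSON could then be supplied as the request body when creating an index.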
This token filter can be viewed as a subset of NFKC/NFKD
Unicode normalization. See the analysis-icu plugin
for full normalization support.
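To make the relationship with NFKC concrete, the following sketch uses Python's standard `unicodedata` module to apply full NFKC normalization to sample strings. This is not the filter's own implementation: NFKC changes more characters than cjk_width does, but on these two inputs it performs the same width folding described above. The helper name `fold_width` is hypothetical.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Apply NFKC normalization (a superset of the width folding described above)."""
    return unicodedata.normalize("NFKC", text)


# Fullwidth ASCII variants fold to basic Latin letters and digits.
print(fold_width("ＡＢＣ１２３"))  # ABC123

# Halfwidth Katakana variants fold to the standard (fullwidth) Kana.
print(fold_width("ｶﾀｶﾅ"))  # カタカナ
```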