
ASCII folding token filter


Converts alphabetic, numeric, and symbolic characters that are not in the Basic Latin Unicode block (the first 128 Unicode characters, which correspond to ASCII) to their ASCII equivalents, if they exist. For example, the filter changes à to a.

This filter uses Lucene’s ASCIIFoldingFilter.
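To make the "outside the Basic Latin block" condition concrete, the following minimal sketch lists the characters in a string whose code points are above U+007F. These are the only characters the filter would try to fold. The helper name non_basic_latin is hypothetical and is not part of Elasticsearch or Lucene.

```ruby
# Hypothetical helper for illustration only; not an Elasticsearch or Lucene API.
# Returns the distinct characters whose code points fall outside the
# Basic Latin block (U+0000 to U+007F).
def non_basic_latin(text)
  # Normalize to NFC first so that accented letters are single code points.
  text.unicode_normalize(:nfc).each_char.select { |ch| ch.ord > 0x7F }.uniq
end

p non_basic_latin('açaí à la carte') # => ["ç", "í", "à"]
```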



The following analyze API request uses the asciifolding filter to drop the diacritical marks in açaí à la carte:

# Assumes `client` is an already configured Elasticsearch Ruby client.
response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      'asciifolding'
    ],
    text: 'açaí à la carte'
  }
)
puts response

GET /_analyze
{
  "tokenizer" : "standard",
  "filter" : ["asciifolding"],
  "text" : "açaí à la carte"
}

The filter produces the following tokens:

[ acai, a, la, carte ]
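The token list above can be roughly reproduced without a cluster. The sketch below is an approximation, not Lucene's implementation: it splits on whitespace instead of using the standard tokenizer, and it folds by decomposing each token (NFD) and stripping combining marks, whereas ASCIIFoldingFilter uses its own character mapping table. The method names fold_ascii and approximate_analyze are hypothetical.

```ruby
# Approximation for illustration only. Lucene's ASCIIFoldingFilter uses a
# character mapping table; this sketch relies on Unicode decomposition instead.
def fold_ascii(token)
  # NFD splits "ç" into "c" plus a combining cedilla; \p{Mn} matches the mark.
  token.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

# Whitespace splitting stands in for the standard tokenizer (an assumption
# that holds for this simple input, not in general).
def approximate_analyze(text)
  text.split.map { |token| fold_ascii(token) }
end

p approximate_analyze('açaí à la carte') # => ["acai", "a", "la", "carte"]
```

Characters that have no Unicode decomposition, such as ß or ø, pass through this sketch unchanged, while the real filter does map them to ASCII equivalents.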

Add to an analyzer


The following create index API request uses the asciifolding filter to configure a new custom analyzer.

# Assumes `client` is an already configured Elasticsearch Ruby client.
response = client.indices.create(
  index: 'asciifold_example',
  body: {
    settings: {
      analysis: {
        analyzer: {
          standard_asciifolding: {
            tokenizer: 'standard',
            filter: [
              'asciifolding'
            ]
          }
        }
      }
    }
  }
)
puts response

PUT /asciifold_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_asciifolding": {
          "tokenizer": "standard",
          "filter": [ "asciifolding" ]
        }
      }
    }
  }
}

Configurable parameters

preserve_original
(Optional, Boolean) If true, emit both the original tokens and the folded tokens. Defaults to false.
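The effect of preserve_original can be sketched on a list of tokens. This is an illustration under stated assumptions, not the filter's actual code: folding is approximated with Unicode decomposition, the original token is kept only when folding changed it, and the folded form is listed before the original. The method names fold_ascii and fold_tokens are hypothetical.

```ruby
# Approximate folding via NFD decomposition (not Lucene's mapping table).
def fold_ascii(token)
  token.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

# Hypothetical helper mimicking the preserve_original option.
# Assumption: the original is emitted only when it differs from the folded
# form, and it follows the folded token.
def fold_tokens(tokens, preserve_original: false)
  tokens.flat_map do |token|
    folded = fold_ascii(token)
    preserve_original && folded != token ? [folded, token] : [folded]
  end
end

tokens = %w[açaí à la carte]
p fold_tokens(tokens)
p fold_tokens(tokens, preserve_original: true)
```

With preserve_original enabled, a search for either açaí or acai can match the same document, at the cost of a larger index.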



To customize the asciifolding filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.

For example, the following request creates a custom asciifolding filter with preserve_original set to true:

# Assumes `client` is an already configured Elasticsearch Ruby client.
response = client.indices.create(
  index: 'asciifold_example',
  body: {
    settings: {
      analysis: {
        analyzer: {
          standard_asciifolding: {
            tokenizer: 'standard',
            filter: [
              'my_ascii_folding'
            ]
          }
        },
        filter: {
          my_ascii_folding: {
            type: 'asciifolding',
            preserve_original: true
          }
        }
      }
    }
  }
)
puts response

PUT /asciifold_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_asciifolding": {
          "tokenizer": "standard",
          "filter": [ "my_ascii_folding" ]
        }
      },
      "filter": {
        "my_ascii_folding": {
          "type": "asciifolding",
          "preserve_original": true
        }
      }
    }
  }
}
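If several indices need the same analysis settings, the request body can be assembled by a small helper instead of being written out each time. The following is a minimal sketch: asciifolding_index_settings is a hypothetical helper, not part of the Elasticsearch Ruby client, and it only builds and prints the body without contacting a cluster.

```ruby
require 'json'

# Hypothetical helper, not part of the Elasticsearch Ruby client.
# Builds create-index settings for a custom analyzer that uses a custom
# asciifolding token filter.
def asciifolding_index_settings(analyzer_name:, filter_name:, preserve_original: false)
  {
    settings: {
      analysis: {
        analyzer: {
          analyzer_name => {
            tokenizer: 'standard',
            filter: [filter_name]
          }
        },
        filter: {
          filter_name => {
            type: 'asciifolding',
            preserve_original: preserve_original
          }
        }
      }
    }
  }
end

body = asciifolding_index_settings(
  analyzer_name: 'standard_asciifolding',
  filter_name: 'my_ascii_folding',
  preserve_original: true
)
puts JSON.pretty_generate(body)
```

The resulting hash has the same shape as the body in the create index request and could be passed as the body: argument of client.indices.create.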