IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.


Flatten graph token filter


Flattens a token graph produced by a graph token filter, such as synonym_graph or word_delimiter_graph.

Indexing does not support token graphs containing multi-position tokens. Flattening such a graph makes it suitable for indexing.

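Flattening graphs is a lossy process. The flattened stream can accept token sequences that the original graph did not, and reject sequences that it did.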

If possible, avoid using the flatten_graph filter. Instead, use graph token filters in search analyzers only. This eliminates the need for the flatten_graph filter.
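One way to follow this advice is to apply the graph filter at search time only. The sketch below builds a create index request body as a Ruby hash. The names my_search_analyzer, my_synonym_graph_filter, and title are hypothetical, chosen only for illustration, and the synonym rule is borrowed from the example on this page.

```ruby
require 'json'

# Hypothetical names: my_search_analyzer, my_synonym_graph_filter, title.
index_body = {
  settings: {
    analysis: {
      analyzer: {
        # Used at search time only, so its token graph is never indexed.
        my_search_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['my_synonym_graph_filter']
        }
      },
      filter: {
        my_synonym_graph_filter: {
          type: 'synonym_graph',
          synonyms: ['dns, domain name system']
        }
      }
    }
  },
  mappings: {
    properties: {
      title: {
        type: 'text',
        analyzer: 'standard',                 # index time: no graph filter
        search_analyzer: 'my_search_analyzer' # search time: graph filter
      }
    }
  }
}

puts JSON.pretty_generate(index_body)
```

Assuming a configured client, this hash could be passed as the body of client.indices.create together with an index name of your choice. Because no graph filter runs at index time, the flatten_graph filter is not needed.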

The flatten_graph filter uses Lucene’s FlattenGraphFilter.

Example

To see how the flatten_graph filter works, you first need to produce a token graph containing multi-position tokens.

The following analyze API request uses the synonym_graph filter to add dns as a multi-position synonym for domain name system in the text "domain name system is fragile":

response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      {
        type: 'synonym_graph',
        synonyms: [
          'dns, domain name system'
        ]
      }
    ],
    text: 'domain name system is fragile'
  }
)
puts response

GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "synonym_graph",
      "synonyms": [ "dns, domain name system" ]
    }
  ],
  "text": "domain name system is fragile"
}

The filter produces a token graph in which dns is a multi-position token: it has a positionLength of 3 and spans the same positions as domain, name, and system.
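To make the idea of a multi-position token concrete, here is a small hand-written model of that graph in Ruby. It is not output from the analyze API: the Token struct and the multi_position_tokens helper are hypothetical, and the positions assume that dns has a position length of 3, covering the same positions as domain, name, and system.

```ruby
# Hand-written model of the token graph; not actual API output.
Token = Struct.new(:term, :position, :position_length)

tokens = [
  Token.new('dns', 0, 3),      # assumed to span three positions
  Token.new('domain', 0, 1),
  Token.new('name', 1, 1),
  Token.new('system', 2, 1),
  Token.new('is', 3, 1),
  Token.new('fragile', 4, 1)
]

# A token is multi-position when it spans more than one position.
def multi_position_tokens(tokens)
  tokens.select { |t| t.position_length > 1 }
end

multi_position_tokens(tokens).each do |t|
  last = t.position + t.position_length - 1
  puts "#{t.term} spans positions #{t.position}..#{last}"
end
# prints "dns spans positions 0..2"
```

Only dns is reported, because every other token in this model occupies a single position.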

Indexing does not support token graphs containing multi-position tokens. To make this token graph suitable for indexing, it needs to be flattened.

To flatten the token graph, add the flatten_graph filter after the synonym_graph filter in the previous analyze API request.

response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      {
        type: 'synonym_graph',
        synonyms: [
          'dns, domain name system'
        ]
      },
      'flatten_graph'
    ],
    text: 'domain name system is fragile'
  }
)
puts response

GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "synonym_graph",
      "synonyms": [ "dns, domain name system" ]
    },
    "flatten_graph"
  ],
  "text": "domain name system is fragile"
}

The filter produces a flattened token graph, which is suitable for indexing.

Add to an analyzer

The following create index API request uses the flatten_graph token filter to configure a new custom analyzer.

In this analyzer, a custom word_delimiter_graph filter produces token graphs containing catenated, multi-position tokens. The flatten_graph filter flattens these token graphs, making them suitable for indexing.

PUT /my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "my_custom_word_delimiter_graph_filter",
            "flatten_graph"
          ]
        }
      },
      "filter": {
        "my_custom_word_delimiter_graph_filter": {
          "type": "word_delimiter_graph",
          "catenate_all": true
        }
      }
    }
  }
}
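For readers using the Ruby client shown in the earlier examples, the same request body can be sketched as a Ruby hash. This is a direct transcription of the request above, not an additional configuration. The final call is left as a comment because it assumes a configured client object.

```ruby
require 'json'

index_body = {
  settings: {
    analysis: {
      analyzer: {
        my_custom_index_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          # flatten_graph comes after the graph filter whose output it flattens.
          filter: [
            'my_custom_word_delimiter_graph_filter',
            'flatten_graph'
          ]
        }
      },
      filter: {
        my_custom_word_delimiter_graph_filter: {
          type: 'word_delimiter_graph',
          catenate_all: true
        }
      }
    }
  }
}

puts JSON.pretty_generate(index_body)

# Assuming a configured client:
# response = client.indices.create(index: 'my-index-000001', body: index_body)
# puts response
```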
