WARNING: Version 2.0 of Elasticsearch has passed its EOL date.

This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.

similarity

Elasticsearch allows you to configure a scoring algorithm, or similarity, per field. The `similarity` setting provides a simple way of choosing a similarity algorithm other than the default TF/IDF, such as BM25.

Similarities are mostly useful for string fields, especially analyzed string fields, but can also apply to other field types.

Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.

The only similarities that can be used out of the box, without any further configuration, are:

`default`: The default TF/IDF algorithm used by Elasticsearch and Lucene. See Lucene’s Practical Scoring Function for more information.
`BM25`: The Okapi BM25 algorithm. See Pluggable Similarity Algorithms for more information.

The similarity can be set on the field level when a field is first created, as follows:

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "default_field": { 
          "type": "string"
        },
        "bm25_field": {
          "type": "string",
          "similarity": "BM25" 
        }
      }
    }
  }
}

	The `default_field` uses the `default` similarity (i.e. TF/IDF).
	The `bm25_field` uses the `BM25` similarity.
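
The custom similarities mentioned earlier are declared in the index settings and then referenced by name from a field mapping. The following is only a sketch, assuming the `k1` and `b` parameters that the similarity module documents for BM25. The index name `my_tuned_index`, the similarity name `my_bm25`, the field name `tuned_field`, and the parameter values are all hypothetical and chosen purely for illustration:

```
# Hypothetical names and illustrative values; see the similarity module for the supported options.
PUT my_tuned_index
{
  "settings": {
    "index": {
      "similarity": {
        "my_bm25": {
          "type": "BM25",
          "k1": 1.6,
          "b": 0.5
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "tuned_field": {
          "type": "string",
          "similarity": "my_bm25"
        }
      }
    }
  }
}
```

In this sketch, `k1` controls how quickly repeated occurrences of a term stop adding to the score, and `b` controls how strongly field length normalizes it. Fields that do not name a similarity keep the `default` one.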
