similarity

Elasticsearch allows you to configure a text scoring algorithm, or similarity, per field. The similarity setting provides a simple way of choosing a text similarity algorithm other than the default BM25, such as boolean.

Only text-based field types like text and keyword support this configuration.

Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.
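
As a minimal sketch of what such tuning can look like, the Python snippet below builds an index-creation body that declares a custom BM25 variant in the index settings and references it from a field mapping. The names my_bm25 and title and the k1 and b values are illustrative assumptions, not recommendations; the similarity module remains the authoritative source for the available parameters.

```python
import json

# Hypothetical index-creation body: "my_bm25", "title", and the parameter
# values are placeholders chosen for illustration only.
custom_similarity_body = {
    "settings": {
        "index": {
            "similarity": {
                # A custom similarity derived from the built-in BM25 type.
                "my_bm25": {
                    "type": "BM25",
                    "k1": 1.3,  # term-frequency saturation (assumed value)
                    "b": 0.6,   # field-length normalization (assumed value)
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # The field refers to the custom similarity by its name.
            "title": {"type": "text", "similarity": "my_bm25"}
        }
    },
}

print(json.dumps(custom_similarity_body, indent=2))
```

The settings and mappings entries of this dictionary could then be passed to client.indices.create, assuming a configured client object is available.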

The only similarities that can be used out of the box, without any further configuration, are:

BM25
The Okapi BM25 algorithm. The algorithm used by default in Elasticsearch and Lucene.
boolean
A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.
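
To make the difference between the two concrete, here is a rough Python sketch of how a single matching term might be scored under each similarity. It uses a simplified form of the BM25 formula with the common defaults k1=1.2 and b=0.75; the function and parameter names are hypothetical, and real scores can differ because of implementation details such as how field lengths are stored.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, doc_freq,
                    boost=1.0, k1=1.2, b=0.75):
    """Simplified BM25 score for one term in one document (illustrative)."""
    # Rarer terms (low doc_freq relative to doc_count) get a higher weight.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Longer-than-average documents are penalized through this norm.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: repeated matches add less and less.
    return boost * idf * freq / (freq + norm)


def boolean_term_score(freq, boost=1.0):
    """Boolean similarity: a matching term scores exactly its query boost."""
    return boost if freq > 0 else 0.0


# Assumed corpus statistics, for illustration only.
stats = dict(doc_len=100, avg_doc_len=100, doc_count=1000, doc_freq=10)

for freq in (1, 5):
    print(freq, bm25_term_score(freq, **stats), boolean_term_score(freq, boost=2.0))
```

In this sketch the BM25 score grows with the term frequency, while the boolean score stays at the query boost regardless of how often the term occurs.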

The similarity can be set on the field level when a field is first created, as follows:

Python:

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "properties": {
            "default_field": {
                "type": "text"
            },
            "boolean_sim_field": {
                "type": "text",
                "similarity": "boolean"
            }
        }
    },
)
print(resp)

Ruby:

response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        default_field: {
          type: 'text'
        },
        boolean_sim_field: {
          type: 'text',
          similarity: 'boolean'
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    properties: {
      default_field: {
        type: "text",
      },
      boolean_sim_field: {
        type: "text",
        similarity: "boolean",
      },
    },
  },
});
console.log(response);

Console:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "default_field": {
        "type": "text"
      },
      "boolean_sim_field": {
        "type": "text",
        "similarity": "boolean"
      }
    }
  }
}

The default_field uses the BM25 similarity.

The boolean_sim_field uses the boolean similarity.