WARNING: Version 6.2 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
similarity
editsimilarity
editElasticsearch allows you to configure a scoring algorithm or similarity per
field. The similarity
setting provides a simple way of choosing a similarity
algorithm other than the default BM25
, such as TF/IDF
.
Similarities are mostly useful for text
fields, but can also apply
to other field types.
Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about this expert options, see the similarity module.
The only similarities which can be used out of the box, without any further configuration are:
-
BM25
- The Okapi BM25 algorithm. The algorithm used by default in Elasticsearch and Lucene. See Pluggable Similarity Algorithms for more information.
-
classic
- The TF/IDF algorithm which used to be the default in Elasticsearch and Lucene. See Lucene’s Practical Scoring Function for more information.
-
boolean
- A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.
The similarity
can be set on the field level when a field is first created,
as follows: