Precision tuning (beta)
This functionality is in beta. Beta features are subject to change and are not covered by the support SLA of general release (GA) features. Elastic plans to promote this feature to GA in a future release.
Precision tuning has no effect in queries that contain synonyms.
To use precision tuning with an Elasticsearch-based engine, you must define text subfields that conform to the Elasticsearch engines text field conventions.
App Search defaults to high recall results: we cast a wide net on your searches. You can use precision tuning to search with a different level of precision and recall by tightening or loosening the term and phrase requirements a document must meet to be considered a match for a given query. Generally, more precision leads to less recall: getting more specific results usually comes at the cost of a lower tolerance for errors or variations in queries.
Tune precision
To tune precision, set precision tuning values in any of the following ways:
Tune precision per engine using the UI
Tune precision within Kibana: Navigate to Search → Enterprise Search → App Search → Engines → engine name → Relevance tuning.
Locate Precision tuning, and set the default precision value for the engine.
The value applies to all queries sent to that engine that don’t provide their own precision value.
Precision tuning is not available for Elasticsearch-based engines that do not conform to the Elasticsearch engines text field conventions.
Tune precision per engine using the API
Use the search settings API to set the default precision value for an engine.
The value applies to all queries sent to that engine that don’t provide their own precision value.
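As a rough illustration, the request for setting an engine's default precision might be assembled as in the Python sketch below. The endpoint path, the `precision` body field name, and the engine name `national-parks` are assumptions made for this example and are not confirmed on this page, so check the search settings API reference before relying on them. The sketch only builds the request; it does not send it.

```python
import json

# Valid range taken from the "Precision tuning values" section.
MIN_PRECISION, MAX_PRECISION = 1, 11


def build_search_settings_request(engine_name: str, precision: int) -> dict:
    """Describe a request that would set an engine's default precision.

    The path and body shape are assumptions for illustration only.
    """
    if isinstance(precision, bool) or not isinstance(precision, int):
        raise TypeError("precision must be an integer")
    if not MIN_PRECISION <= precision <= MAX_PRECISION:
        raise ValueError(
            f"precision must be between {MIN_PRECISION} and {MAX_PRECISION}"
        )
    return {
        "method": "PUT",
        # Assumed endpoint path; verify against the API reference.
        "path": f"/api/as/v1/engines/{engine_name}/search_settings",
        "body": json.dumps({"precision": precision}),
    }


# "national-parks" is a hypothetical engine name.
request = build_search_settings_request("national-parks", 4)
print(request["method"], request["path"], request["body"])
```

Validating the range on the client side is a design choice of this sketch: it surfaces an out-of-range value before any request is made.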
Tune precision per query using the API
Use the precision parameter of the search API to set the value for a particular query.
When set per query, the precision value overrides the default value set for the engine.
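The override rule can be sketched in a few lines of Python. The helper names and the request body shape (a `query` field next to `precision`) are assumptions for illustration; only the rule itself, that a per-query value wins over the engine default, comes from the text above.

```python
from typing import Optional


def effective_precision(engine_default: int,
                        query_precision: Optional[int] = None) -> int:
    """Return the precision that applies to a query.

    A per-query value, when present, overrides the engine default.
    """
    return engine_default if query_precision is None else query_precision


def build_search_body(query: str, precision: Optional[int] = None) -> dict:
    """Build a hypothetical search request body.

    The precision field is left out when no per-query value is given,
    so the engine default applies.
    """
    body = {"query": query}
    if precision is not None:
        body["precision"] = precision
    return body


print(effective_precision(2))        # engine default applies
print(effective_precision(2, 8))     # per-query value wins
print(build_search_body("national park", precision=8))
```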
Precision tuning values
Precision tuning combines analyzers, fuzzy queries, and term and phrase matching, and controls them using numeric values.
Precision tuning values are integers that range from 1 to 11.
The range of values represents a sliding scale that manages the inherent tradeoff between precision and recall.
Lower values favor recall, while higher values favor precision.
The precision tuning value for a query changes which documents match that query.
The following table describes each precision tuning value, including the effect on analyzers, fuzzy queries, and term and phrase matching. You can change the precision tuning for the same query and observe the effects it has on the results. Experiment with different values to find the value that works best with each engine’s documents.
Value | Description | Analyzers | Fuzzy queries | Phrase matching |
---|---|---|---|---|
1 | Lowest precision and highest recall setting. | All | Yes | At least one term in any field must match. |
2 | Default. High recall, low precision. | All | Yes | Less than half of the terms must match. |
3 | Increasing phrase matching: half the terms. | All | Yes | Queries with two or fewer terms require all terms to match. With more terms, half the terms must match (rounded up). |
4 | Increasing phrase matching: three-quarters of the terms. | All | Yes | Queries with three or fewer terms require all terms to match, then three-quarters of terms must match (rounded down). |
5 | Increasing phrase matching requirements: all but one of the terms. | All | Yes | Queries with four or fewer terms require all terms to match, then all but one terms must match. |
6 | All terms must match. | All | Yes | Every term must appear in the document, in any field. |
7 | The strictest phrase matching requirement: all terms must match, and in the same field. | All | Yes | Every term must appear in the document. |
8 | Decreasing typo tolerance: advanced typo tolerance is disabled. | All | No | Every term must appear in the same field in the document. |
9 | Decreasing term matching: prefixing is disabled. | Default, Stem, Joined | No | Every term must appear in the same field. |
10 | Decreasing typo-tolerance: no compound-word correction. | Default, Stem | No | Every term must appear in the same field. |
11 | Exact spelling matches only. | Default | No | Every tokenized term must appear in the same field. NOTE: This is not an exact match against the field value (e.g. a search for "PART-123" can return documents that contain both tokenized "PART" and "123" terms such as "PART-123-456"). To exactly match a field value, use Search API filters. |
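The phrase matching column can be read as a rule for how many query terms must match. The Python sketch below is one interpretation of that column, not the actual implementation: the function name is hypothetical, the row-to-value mapping assumes the rows run from 1 to 11 in order, and it ignores analyzers, fuzzy matching and the same-field requirement. The description for value 2 does not give an exact count, so the sketch returns None there rather than guess.

```python
from typing import Optional


def required_term_matches(precision: int, term_count: int) -> Optional[int]:
    """Minimum number of query terms that must match, per the table.

    Returns None for precision 2, where the table gives no exact formula.
    """
    if not 1 <= precision <= 11:
        raise ValueError("precision must be between 1 and 11")
    if term_count < 1:
        raise ValueError("term_count must be at least 1")

    if precision == 1:
        return 1                          # at least one term
    if precision == 2:
        return None                       # "less than half": not specified exactly
    if precision == 3:
        # two or fewer terms: all; otherwise half, rounded up
        return term_count if term_count <= 2 else (term_count + 1) // 2
    if precision == 4:
        # three or fewer terms: all; otherwise three-quarters, rounded down
        return term_count if term_count <= 3 else (3 * term_count) // 4
    if precision == 5:
        # four or fewer terms: all; otherwise all but one
        return term_count if term_count <= 4 else term_count - 1
    return term_count                     # 6 and above: every term


for value in (1, 3, 4, 5, 6):
    print(value, required_term_matches(value, 5))
```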
Precision tuning concepts
The following concepts describe how precision tuning works:
Analyzers
Precision tuning works by applying different analyzers to your documents' fields, using multi fields. Using different analyzers allows you to change how search queries look for results.
Enterprise Search provides the following analyzers for text fields:
Default analyzer
The default analyzer does not change the analysis for a text field. It only ignores letter case and removes stop words according to the language used for the engine.
For example, the query "a brand new super duper model" would match a document containing "Brand New Super Duper model".
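A toy Python sketch of this behavior follows. It is not the analyzer App Search uses: the stop word list is a tiny made-up sample, tokenization is a plain whitespace split, and the function names are hypothetical.

```python
# Tiny illustrative stop word list; real lists are language specific.
STOP_WORDS = {"a", "an", "the"}


def default_analyze(text: str) -> list:
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [term for term in text.lower().split() if term not in STOP_WORDS]


def all_terms_match(query: str, document: str) -> bool:
    """True when every analyzed query term appears in the analyzed document."""
    document_terms = set(default_analyze(document))
    return all(term in document_terms for term in default_analyze(query))


print(default_analyze("a brand new super duper model"))
print(all_terms_match("a brand new super duper model",
                      "Brand New Super Duper model"))
```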
Stem analyzer
The stem analyzer reduces each word to its root form, so that variants of a word match during a search.
This analyzer depends on the language chosen for the engine, because each language uses different strategies to obtain the root of a word.
For example, a query for "the fox jumps" would match a document containing "the foxes jumping". Variations of the verb (jump, jumps, jumping) or the noun (fox, foxes) do not affect which results are retrieved.
For more information, see stemming.
Prefix analyzer
The prefix analyzer uses a query as a prefix for matching words. Using the prefix analyzer, a "congrat" query would match documents containing either "congratulations" or "congrats", as both words start with the input query.
The prefix analyzer is useful for autocomplete and suggestion search types, where we are interested in results that match a specified prefix.
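The idea can be sketched in Python with a plain string prefix check. This only illustrates the matching behavior described above; it says nothing about how the prefix analyzer is implemented, and the function name and sample document are hypothetical.

```python
def prefix_matches(query: str, document: str) -> list:
    """Return the document words that start with the query, ignoring case."""
    prefix = query.lower()
    return [word for word in document.lower().split() if word.startswith(prefix)]


# Both words share the "congrat" prefix, so both are returned.
print(prefix_matches("congrat", "Congratulations and congrats to all"))
```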
Joined analyzer
The joined analyzer checks separate words as if they were a single word. For example, a query for "ecommerce" would match a document containing "e commerce".
It is useful for words that can appear either joined or separated in queries and documents.
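One way to picture this is to join each pair of adjacent document words and compare the result with the query. The Python sketch below does exactly that. Joining only adjacent pairs is an assumption of this example, and the function names are hypothetical.

```python
def joined_terms(text: str) -> set:
    """Join each pair of adjacent words into a single lowercase term."""
    tokens = text.lower().split()
    return {first + second for first, second in zip(tokens, tokens[1:])}


def joined_match(query: str, document: str) -> bool:
    """True when the query equals two adjacent document words joined together."""
    return query.lower() in joined_terms(document)


print(joined_terms("e commerce results"))
print(joined_match("ecommerce", "e commerce"))
```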
Delimiter analyzer
The delimiter analyzer removes some delimiters that might not be meaningful to the search. For example, a query containing "super-duper-xl" would match documents containing "super duper xl" and "superduperxl".
It is conceptually similar to the joined analyzer, but instead of focusing on words or parts of words, it removes delimiters so that the search focuses on the text content.
For more information about the delimiter capabilities, see delimiter token filter.
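The example above can be reproduced with a small Python sketch that splits a term on delimiters and emits both a space-separated and a concatenated form. The set of characters treated as delimiters here is an assumption for illustration, and the real token filter has more options than this.

```python
import re

# Assumed delimiter set for this sketch: hyphen, underscore, dot, slash.
DELIMITERS = re.compile(r"[-_./]+")


def delimiter_variants(term: str) -> set:
    """Return the term with delimiters replaced by spaces and removed entirely."""
    parts = [part for part in DELIMITERS.split(term.lower()) if part]
    return {" ".join(parts), "".join(parts)}


print(delimiter_variants("super-duper-xl"))
```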
Fuzzy queries
Fuzzy queries create small variations of the query terms, by changing one or more characters:
- Changing a character (box → fox)
- Removing a character (black → lack)
- Inserting a character (sic → sick)
- Transposing two adjacent characters (act → cat)
Depending on how long each query term is, more characters or fewer characters are allowed to change:
- Words 1 or 2 letters long, or the first two letters of a longer word, must match exactly ("at" won’t match "ax", "click" won’t match "slick")
- Words with length 3 to 5 can differ in 1 character ("click" will match "clack")
- Words with more than 5 letters can differ in 2 characters ("fussiness" will match "fuzziness")
Only the default analyzer and the stem analyzer are used in fuzzy queries.
Fuzzy queries are helpful for allowing typo tolerance in searches.
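The length rules above can be sketched in Python as follows. This is an illustration of the rules as stated, not the Elasticsearch implementation: it measures the difference between two words with an optimal string alignment distance (substitutions, deletions, insertions, and adjacent transpositions each count as one change), and the function names are hypothetical.

```python
def max_edits(term: str) -> int:
    """Number of character changes allowed for a query term of this length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between two words."""
    previous_2 = None
    previous = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        current = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            current[j] = min(previous[j] + 1,         # remove a character
                             current[j - 1] + 1,      # insert a character
                             previous[j - 1] + cost)  # change a character
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # transpose two adjacent characters
                current[j] = min(current[j], previous_2[j - 2] + 1)
        previous_2, previous = previous, current
    return previous[len(b)]


def fuzzy_match(query_term: str, document_term: str) -> bool:
    """Apply the rules above: exact first two letters, then limited edits."""
    if query_term[:2] != document_term[:2]:
        return False
    return osa_distance(query_term, document_term) <= max_edits(query_term)


print(fuzzy_match("click", "clack"))
print(fuzzy_match("click", "slick"))
print(fuzzy_match("fussiness", "fuzziness"))
```

The allowed number of changes is derived from the query term's length, following the wording "depending on how long each query term is".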
Matching terms and phrases
App Search matches documents to a query at the term and phrase levels.
- Term matching refers to how App Search handles individual terms within queries and documents. Terms are usually words, but can be any arbitrary group of letters or numbers. App Search uses analyzers to process text into terms.
- Phrase matching applies when a query contains multiple terms. When determining which documents match a query, App Search may consider the number of query terms that appear in the document, the ordering of the terms, or where in the document the terms appear (for example, within the same field). See Phrase matching examples.
Troubleshooting precision tuning
Precision tuning is not an exact science. You may find that some results are not what you would expect.
These are some recommendations for understanding precision tuning results:
- Review the descriptions for the precision tuning values. Review the analyzers and matching for the current precision tuning value.
- Within the precision tuning UI, test different values using the precision slider, and experiment with different query terms.
- Use the Search Explain API to understand the Elasticsearch query for different precision settings.
- Use the Elasticsearch Search Explain API to understand why a particular result does (or does not) match an Elasticsearch query.
Examples
Phrase matching examples
Consider an engine with a single document whose title field is "American Samoa National Park".
Let’s change precision tuning values and check what the results are:
Precision value | Query | Results | Explanation |
---|---|---|---|
| | Yes | A single term in the query ( |
| | No | Fewer than half of the terms in the query match (only |
| | Yes | Half of the terms in the query ( |
| | No | It’s a 2 term query, all elements must match |
| | Yes | It’s a 2 term query, and all elements match |
| | Yes | It’s a query with more than 2 terms, so it’s enough for half of them to match |
| | No | It’s a query with 3 terms, so every term must match |
| | No | It’s a query with more than 3 terms, and only half of them match ( |
| | Yes | It’s a query with more than 3 terms, and three quarters of them match ( |
| | No | It’s a query with 4 terms, all terms should match |
| | Yes | It’s a query with more than 4 terms, all terms but one should match ( |
| | No | It’s a query with more than 4 terms, all terms but one should match ( |
Term matching examples
Consider an engine with a single document with the following fields:
- title: e-commerce results
- year: FY 2022
Let’s change precision tuning values and check what the results are:
Precision value | Query | Results | Explanation |
---|---|---|---|
| | Yes | Delimiter analyzer allows matching |
| | No | Delimiter analyzer is not active for precision levels 10-11, so |
| | Yes | Joined analyzer allows matching |
| FY2022 | No | Joined analyzer is not active for precision levels 10-11, so |
| | Yes | Stemming is used for finding the root for |
| | No | Stemming is disabled at value 11. |
| | Yes | Prefix analyzer is used for retrieving |
| | No | Prefixing is disabled from precision value 9. |
| | Yes | Fuzzy matching allows up to two characters (as it’s a word longer than 5 characters) to be missing from |
| | No | Fuzzy matching is disabled, |
| | Yes | Query terms can be present in any field ( |
| | No | Query terms must be present in the same field from precision level 7. |