IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Precision tuning (beta)

This functionality is in beta. Beta features are subject to change and are not covered by the support SLA of general release (GA) features. Elastic plans to promote this feature to GA in a future release.

Precision tuning has no effect in queries that contain synonyms.

To use precision tuning with an Elasticsearch-based engine, you must define text subfields that conform to the Elasticsearch engines text field conventions.

App Search defaults to high recall results: we cast a wide net on your searches. You can use precision tuning to search with a different level of precision and recall—tightening or loosening the term and phrase requirements needed for a document to be considered a match to a given query. Generally, more precision leads to less recall: getting more specific results usually comes at the cost of a lower tolerance of errors or variations in queries.

Tune precision

To tune precision, set precision tuning values in any of the following ways:

Tune precision per engine using the UI

Tune precision within Kibana: Navigate to Enterprise Search → App Search → Engines → engine name → Relevance tuning. Locate Precision tuning, and set the default precision value for the engine. This value applies to every query sent to that engine that does not provide its own precision value.

Precision tuning is not available for Elasticsearch based engines that do not conform to Elasticsearch engines text field conventions.

Tune precision per engine using the API

Use the search settings API to set the default precision value for an engine. This value applies to every query sent to that engine that does not provide its own precision value.

Tune precision per query using the API

Use the precision parameter of the search API to set the value for a particular query. When set per query, the precision value overrides the default value set for the engine.

Precision tuning values

Each precision tuning value is a number that selects a specific combination of analyzers, fuzzy queries, and term and phrase matching rules.

Precision tuning values are integers that range from 1 to 11. The range of values represents a sliding scale that manages the inherent tradeoff between precision and recall. Lower values favor recall, while higher values favor precision.

The precision tuning value for a query changes which documents match that query.

The following table describes each precision tuning value, including its effect on analyzers, fuzzy queries, and term and phrase matching. You can change the precision tuning value for the same query and observe how the results change. Experiment with different values to find the one that works best with each engine’s documents.

Value	Description	Analyzers	Fuzzy queries	Phrase matching
`1`	Lowest precision and highest recall setting.	All	Yes	At least one term in any field must match.
`2`	Default. High recall, low precision.	All	Yes	Less than half of the terms must match.
`3`	Increasing phrase matching: half the terms.	All	Yes	Queries with two or fewer terms require all terms to match. With more terms, half the terms must match (rounded up).
`4`	Increasing phrase matching: three-quarters of the terms.	All	Yes	Queries with three or fewer terms require all terms to match. With more terms, three-quarters of the terms must match (rounded down).
`5`	Increasing phrase matching: all but one of the terms.	All	Yes	Queries with four or fewer terms require all terms to match. With more terms, all but one term must match.
`6`	All terms must match.	All	Yes	Every term must appear in the document, in any field.
`7`	The strictest phrase matching requirement: all terms must match, and in the same field.	All	Yes	Every term must appear in the same field in the document.
`8`	Decreasing typo tolerance: advanced typo tolerance is disabled.	All	No	Every term must appear in the same field in the document.
`9`	Decreasing term matching: prefixing is disabled.	Default Stem Joined Delimiter	No	Every term must appear in the same field.
`10`	Decreasing typo-tolerance: no compound-word correction.	Default Stem	No	Every term must appear in the same field.
`11`	Exact spelling matches only.	Default	No	Every tokenized term must appear in the same field. NOTE: This is not an exact match against the field value (e.g. a search for "PART-123" can return documents that contain both tokenized "PART" and "123" terms such as "PART-123-456"). To exactly match a field value, use Search API filters.

Precision tuning concepts

The following concepts describe how precision tuning works:

Analyzers

Precision tuning works by applying different analyzers to your documents' fields, using multi fields. Using different analyzers changes how search queries look for results.

Enterprise Search provides the following analyzers for text fields:

Default analyzer

The default analyzer does not change the analysis for a text field. It merely ignores upper and lower casing, and removes stop words according to the language used for the engine.

For example, the query "a brand new super duper model" would match a document containing "Brand New Super Duper model".

Stem analyzer

The stem analyzer tries to retrieve the root from the different words that are introduced. This ensures variants of a word match during a search.

This analyzer depends on the language chosen for the engine, as different strategies are used for obtaining the roots for a word depending on the language.

For example, a query for "the fox jumps" would match a document containing "the foxes jumping". Variations of the verb (jump, jumps, jumping) or the noun (fox, foxes) are ignored when retrieving the search results.

For more information, see stemming.

Prefix analyzer

The prefix analyzer uses a query as a prefix for matching words. Using the prefix analyzer, a "congrat" query would match documents containing either "congratulations" or "congrats", as both words start with the input query.

The prefix analyzer is useful for autocomplete and suggestion search types, where we are interested in results that match a specified prefix.

Joined analyzer

The joined analyzer checks separate words as if they were a single word. For example, a query for "ecommerce" would match a document containing "e commerce".

It is useful to allow for words that can appear joined or separated in searches and document results.

Delimiter analyzer

The delimiter analyzer removes some delimiters that might not be meaningful to the search. For example, a query containing "super-duper-xl" would match documents containing "super duper xl" and "superduperxl".

It is conceptually similar to the joined analyzer, but instead of focusing on words or parts of words, it removes delimiters that may not be meaningful to the search.

It is useful to remove delimiters and focus on the text content of the search.

For more information about the delimiter capabilities, see delimiter token filter.
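
The effect of the default, prefix, joined, and delimiter analyzers can be illustrated with a toy Python sketch. It is a deliberately simplified approximation and not how Elasticsearch or Enterprise Search implement analysis: the stop word list, the regular expression, and every function name here are assumptions made for illustration. Stemming is left out because it depends on a language-specific algorithm.

```python
import re

# Tiny illustrative stop word list; the real list depends on the engine language.
STOP_WORDS = {"a", "an", "the"}


def words(text: str) -> list[str]:
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def default_terms(text: str) -> list[str]:
    # Default analyzer (toy): ignore casing and drop stop words.
    return [w for w in words(text) if w not in STOP_WORDS]


def prefix_match(query: str, text: str) -> bool:
    # Prefix analyzer (toy): the query is a prefix of some word in the text.
    return any(w.startswith(query.lower()) for w in words(text))


def joined_terms(text: str) -> set[str]:
    # Joined analyzer (toy): treat each pair of adjacent words as one word.
    ws = words(text)
    return {ws[i] + ws[i + 1] for i in range(len(ws) - 1)}


def delimiter_terms(text: str) -> set[str]:
    # Delimiter analyzer (toy): for each whitespace-separated token, keep the
    # parts between delimiters and also the parts glued together.
    terms: set[str] = set()
    for token in text.lower().split():
        parts = re.findall(r"[a-z0-9]+", token)
        terms.update(parts)
        if len(parts) > 1:
            terms.add("".join(parts))
    return terms
```

With these helpers, `default_terms("a brand new super duper model")` equals `default_terms("Brand New Super Duper model")`, `"ecommerce" in joined_terms("e commerce")` is true, and `delimiter_terms("super-duper-xl")` shares terms with both `delimiter_terms("super duper xl")` and `delimiter_terms("superduperxl")`.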

Fuzzy queries

Fuzzy queries create small variations of the query terms, by changing one or more characters:

Changing a character (box → fox)
Removing a character (black → lack)
Inserting a character (sic → sick)
Transposing two adjacent characters (act → cat)

Depending on how long each query term is, more characters or fewer characters are allowed to change:

Words 1 or 2 letters long must match exactly ("at" won’t match "ax")
Words with length 3 to 5 can differ in 1 character ("click" will match "slick")
Words with more than 5 letters can differ in 2 characters ("fussiness" will match "fuzziness")

Only the default analyzer and the stem analyzer are used in fuzzy queries.

Fuzzy queries are helpful for allowing typo tolerance in searches.
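
The length rules and edit types above can be sketched in Python. This is an illustration of the rules as described in this section, not the Elasticsearch implementation. The function names are hypothetical, and the distance function is a standard optimal string alignment distance, which counts a change, removal, insertion, or adjacent transposition as one edit each.

```python
def max_edits(term: str) -> int:
    # Allowed edits by query term length, following the rules above.
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def edit_distance(a: str, b: str) -> int:
    # Optimal string alignment distance: substitutions, deletions, insertions,
    # and transpositions of two adjacent characters each cost one edit.
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transpose
    return d[-1][-1]


def fuzzy_match(query_term: str, document_term: str) -> bool:
    # A document term matches if it is within the edits allowed for the query term.
    return edit_distance(query_term, document_term) <= max_edits(query_term)
```

For instance, `fuzzy_match("click", "slick")` and `fuzzy_match("fussiness", "fuzziness")` are true, while `fuzzy_match("at", "ax")` is false because two-letter terms must match exactly.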

Matching terms and phrases

App Search matches documents to a query at the term and phrase levels.

Term matching refers to how App Search handles individual terms within queries and documents. Terms are usually words, but can be any arbitrary group of letters or numbers. App Search uses analyzers to process text into terms.
Phrase matching applies when a query contains multiple terms. When determining which documents match a query, App Search may consider the number of query terms that appear in the document, the ordering of the terms, or where in the document the terms appear (for example, within the same field). See Phrase matching examples for worked cases.

Troubleshooting precision tuning

Precision tuning is not an exact science. You may find that some results are not what you would expect.

These are some recommendations for understanding precision tuning results:

Review the descriptions for the precision tuning values. Review the analyzers and matching for the current precision tuning value.
Within the precision tuning UI, test different values using the precision slider, and experiment with different query terms.
Use the Search Explain API to understand the Elasticsearch query for different precision settings.
Use the Elasticsearch Search Explain API to understand why a particular result does (or does not) match an Elasticsearch query.

Examples

Phrase matching examples

Consider an engine with a single document whose title field is "American Samoa National Park".

Let’s change precision tuning values and check what the results are:

Precision value	Query	Results	Explanation
`1`	`Joshua Tree Park`	Yes	A single term in the query (`Park`) causes the result to be retrieved
`2`	`Joshua Tree Park`	No	Fewer than half of the terms in the query match (only `Park`)
`2`	`Joshua Tree National Park`	Yes	Half of the terms in the query (`National`, `Park`) match with the document
`3`	`Joshua Park`	No	It’s a two-term query, so all terms must match, and `Joshua` does not
`3`	`National Park`	Yes	It’s a two-term query, and all terms match
`3`	`Joshua Tree National Park`	Yes	It’s a query with more than 2 terms, so it’s enough for half of them to match
`4`	`Joshua Tree Park`	No	It’s a query with 3 terms, so every term must match
`4`	`Joshua Tree National Park`	No	It’s a query with more than 3 terms, and only half of them match (`National` `Park`)
`4`	`Joshua Tree American National Park`	Yes	It’s a query with more than 3 terms, and three quarters of them match (`American National Park`)
`5`	`American Tree National Park`	No	It’s a query with 4 terms, all terms should match
`5`	`American Samoa Tree National Park`	Yes	It’s a query with more than 4 terms, all terms but one should match (`Tree` does not match)
`5`	`American Samoa Joshua Tree National Park`	No	It’s a query with more than 4 terms, all terms but one should match (`Joshua` and `Tree` do not match)
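
The term-count rules behind these examples can be written as a small Python sketch. It encodes only the rules stated in the precision tuning values table for values 1 and 3 to 11. Value 2 is left out on purpose because the table does not give a precise formula for it. The function names are hypothetical, and matching is reduced to case-insensitive whole-word comparison with no analyzers or fuzzy queries.

```python
def required_matches(precision: int, n_terms: int) -> int:
    """Minimum number of query terms that must match, per the values table."""
    if n_terms < 1:
        raise ValueError("a query needs at least one term")
    if precision == 1:
        return 1  # at least one term must match
    if precision == 2:
        # Assumption: not modeled, the table gives no exact rule for value 2.
        raise NotImplementedError("value 2 is not modeled in this sketch")
    if precision == 3:
        # Two or fewer terms: all. Otherwise half, rounded up.
        return n_terms if n_terms <= 2 else (n_terms + 1) // 2
    if precision == 4:
        # Three or fewer terms: all. Otherwise three-quarters, rounded down.
        return n_terms if n_terms <= 3 else (n_terms * 3) // 4
    if precision == 5:
        # Four or fewer terms: all. Otherwise all but one.
        return n_terms if n_terms <= 4 else n_terms - 1
    if 6 <= precision <= 11:
        return n_terms  # every term must match
    raise ValueError("precision must be an integer from 1 to 11")


def title_matches(precision: int, query: str, title: str) -> bool:
    # Count query terms that appear as whole words in the title.
    query_terms = query.lower().split()
    title_terms = set(title.lower().split())
    matched = sum(1 for term in query_terms if term in title_terms)
    return matched >= required_matches(precision, len(query_terms))
```

Calling `title_matches(4, "Joshua Tree American National Park", "American Samoa National Park")` returns true (three of five terms match, and three are required), while `title_matches(5, "American Tree National Park", "American Samoa National Park")` returns false.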

Term matching examples

Consider an engine with a single document with the following fields:

title: e-commerce results
year: FY 2022

Let’s change precision tuning values and check what the results are:

Precision values	Query	Results	Explanation
`1` to `9`	`ecommerce`	Yes	Delimiter analyzer allows matching `ecommerce` query to `e-commerce`.
`10` to `11`	`ecommerce`	No	Delimiter analyzer is not active for precision levels 10-11, so `ecommerce` does not match `e-commerce`.
`1` to `9`	`FY2022`	Yes	Joined analyzer allows matching `FY2022` query to `FY 2022`.
`10` to `11`	`FY2022`	No	Joined analyzer is not active for precision levels 10-11, so `FY2022` does not match `FY 2022`.
`1` to `10`	`resulting`	Yes	Stemming is used for finding the root for `resulting` and `results`.
`11`	`resulting`	No	Stemming is disabled at value 11.
`1` to `8`	`res`	Yes	Prefix analyzer is used for retrieving `result`.
`9` to `11`	`res`	No	Prefixing is disabled from precision value 9.
`1` to `7`	`comerc`	Yes	Fuzzy matching allows up to two character edits (the term is longer than 5 characters), so `comerc` matches `commerce`.
`8` to `11`	`comerc`	No	Fuzzy queries are disabled from precision value 8, so `comerc` cannot match `e-commerce`.
`1` to `6`	`e-commerce FY 2022`	Yes	Query terms can be present in any field (`title` and `year`).
`7` to `11`	`e-commerce FY 2022`	No	Query terms must be present in the same field from precision level 7. `e-commerce` is in `title` field and `FY 2022` in `year` field, so there is no match.
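
As a summary, the feature switches behind these examples can be collected in a small Python lookup. This is a reading of this page rather than an official mapping: the fuzzy query and same-field cutoffs follow the precision tuning values table, and the delimiter analyzer is assumed to stay active through value 9, as the `ecommerce` rows indicate. The function and key names are hypothetical.

```python
ALL_ANALYZERS = ("default", "stem", "prefix", "joined", "delimiter")


def precision_features(precision: int) -> dict:
    """Return the analyzers and matching switches assumed for a precision value."""
    if not 1 <= precision <= 11:
        raise ValueError("precision must be an integer from 1 to 11")
    if precision <= 8:
        analyzers = ALL_ANALYZERS
    elif precision == 9:
        analyzers = ("default", "stem", "joined", "delimiter")  # prefixing off
    elif precision == 10:
        analyzers = ("default", "stem")  # joined and delimiter off
    else:
        analyzers = ("default",)  # stemming off: exact spelling only
    return {
        "analyzers": analyzers,
        "fuzzy_queries": precision <= 7,   # typo tolerance ends at value 8
        "same_field": precision >= 7,      # all terms in one field from value 7
    }
```

For example, `precision_features(9)["analyzers"]` no longer includes `"prefix"`, which is why the `res` query stops matching at value 9.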