WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.

This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.


prefix Query


To find all postcodes beginning with W1, we could use a simple prefix query:

GET /my_index/address/_search
{
    "query": {
        "prefix": {
            "postcode": "W1"
        }
    }
}


The prefix query is a low-level query that works at the term level. It doesn’t analyze the query string before searching. It assumes that you have passed it the exact prefix that you want to find.
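To illustrate what "term level" means here, the following is a minimal Python sketch, not Elasticsearch code. It assumes the postcodes are indexed as exact, unanalyzed values, as in the example documents; the function name matching_terms is hypothetical.

```python
# Terms as stored in the index: exact, unanalyzed postcode values.
indexed_terms = ["SW5 0BE", "W1F 7HW", "W1V 3DG", "W2F 8HW", "WC1N 1LZ"]

def matching_terms(prefix):
    """Return the indexed terms that start with the prefix exactly as given."""
    # No lowercasing, trimming, or tokenizing is applied to the prefix.
    return [term for term in indexed_terms if term.startswith(prefix)]

print(matching_terms("W1"))  # ['W1F 7HW', 'W1V 3DG']
print(matching_terms("w1"))  # [] because the lowercase prefix matches no stored term
```

Because the prefix is compared as is, a prefix typed in the wrong case simply finds nothing.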

By default, the prefix query does no relevance scoring. It just finds matching documents and gives them all a score of 1. Really, it behaves more like a filter than a query. The only practical difference between the prefix query and the prefix filter is that the filter can be cached.

Previously, we said that “you can find only terms that exist in the inverted index,” but we haven’t done anything special to index these postcodes; each postcode is simply indexed as the exact value specified in each document. So how does the prefix query work?

Remember that the inverted index consists of a sorted list of unique terms (in this case, postcodes). For each term, it lists the IDs of the documents containing that term in the postings list. The inverted index for our example documents looks something like this:

Term:          Doc IDs:
-------------------------
"SW5 0BE"    |  5
"W1F 7HW"    |  3
"W1V 3DG"    |  1
"W2F 8HW"    |  2
"WC1N 1LZ"   |  4
-------------------------

To support prefix matching on the fly, the query does the following:

1. Skips through the terms list to find the first term beginning with W1.
2. Collects the associated document IDs.
3. Moves to the next term.
4. If that term also begins with W1, the query repeats from step 2; otherwise, we’re finished.
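The steps above can be sketched in a few lines of Python. This is only an illustration of the idea under simplifying assumptions, not how Lucene implements it: the term dictionary is modeled as a sorted Python list, the postings as a dict, and prefix_search is a hypothetical name.

```python
from bisect import bisect_left

# Sorted term list and postings (doc IDs), mirroring the example index above.
TERMS = ["SW5 0BE", "W1F 7HW", "W1V 3DG", "W2F 8HW", "WC1N 1LZ"]
POSTINGS = {
    "SW5 0BE": [5],
    "W1F 7HW": [3],
    "W1V 3DG": [1],
    "W2F 8HW": [2],
    "WC1N 1LZ": [4],
}

def prefix_search(prefix):
    """Return (doc_ids, terms_visited) for all terms starting with prefix."""
    doc_ids = []
    visited = 0
    # Skip to the first term that sorts at or after the prefix.
    i = bisect_left(TERMS, prefix)
    # Collect doc IDs, move to the next term, and repeat while it still matches.
    while i < len(TERMS) and TERMS[i].startswith(prefix):
        doc_ids.extend(POSTINGS[TERMS[i]])
        visited += 1
        i += 1
    return doc_ids, visited

print(prefix_search("W1"))  # ([3, 1], 2)
```

Finding the starting point is cheap because the list is sorted; the cost lies in the loop, which runs once per matching term.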

While this works fine for our small example, imagine that our inverted index contains a million postcodes beginning with W1. The prefix query would need to visit all one million terms in order to calculate the result!

And the shorter the prefix, the more terms need to be visited. If we were to look for the prefix W instead of W1, perhaps we would match 10 million terms instead of just one million.
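The effect of prefix length can be seen even on the five example terms. The sketch below is a toy illustration with a hypothetical terms_visited helper; it counts how many terms a walk over a sorted term list would touch for prefixes of increasing length.

```python
from bisect import bisect_left

# The example postcodes, kept in sorted order like an inverted index.
TERMS = sorted(["SW5 0BE", "W1F 7HW", "W1V 3DG", "W2F 8HW", "WC1N 1LZ"])

def terms_visited(prefix):
    """Count the terms a prefix walk must visit for the given prefix."""
    start = bisect_left(TERMS, prefix)
    count = 0
    for term in TERMS[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later term can match
        count += 1
    return count

for prefix in ("W", "W1", "W1F"):
    print(prefix, terms_visited(prefix))
# W 4
# W1 2
# W1F 1
```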

The prefix query and filter are useful for ad hoc prefix matching but should be used with care. They can be used freely on fields with a small number of terms, but they scale poorly and can put your cluster under a lot of strain. Try to limit their impact by using a long prefix; this reduces the number of terms that need to be visited.

Later in this chapter, we present an alternative index-time solution that makes prefix matching much more efficient. But first, we’ll take a look at two related queries: the wildcard and regexp queries.
