Query Suggestions Guide

How do query suggestions work?

How do query suggestion queries differ from search queries?

A Query Suggestion provides recommended queries.

It’s been called autocomplete, typeahead... It’s a feature with many names.

Suggestions work against indexed data and do not relate to previous or popular search queries.

It’s really a custom-built, refined search query.

We’ll break down how it works.

Query Suggestion v. Search

There are three important things to provide to a query suggestion request.

  1. Query. Usually a partial query. It’s a word, phrase, or text fragment which will be used to find "good matches".
  2. Fields. For which fields do you want suggestions?
  3. Number of Suggestions. Usually around 5 feels good, but you can return up to 20.

The sample National Park data set found throughout the documentation makes for good demonstration:

curl -X POST 'https://[instance id].ent-search.[region].[provider].cloud.es.io/api/as/v1/engines/national-parks-demo/query_suggestion' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer search-7eud55t7ecdmqzcanjsc9cqu' \
-d '{
  "query": "car",
  "types": {
    "documents": {
      "fields": [
        "title",
        "states"
      ]
    }
  },
  "size": 3
}'

The above query looks for a partial term - "car" - in the title and states fields.

It requests up to 3 suggestions using the size parameter.
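As a rough sketch, the same request body could be assembled in client code from the three inputs described above. The helper name `build_suggestion_request` and the `MAX_SUGGESTIONS` constant are hypothetical, not part of any App Search client; only the body shape and the limit of 20 come from this guide.

```python
import json

# Upper bound on suggestions, as stated in this guide.
MAX_SUGGESTIONS = 20


def build_suggestion_request(query, fields, size=5):
    """Return a JSON-serializable body for a query_suggestion request.

    Hypothetical helper: it only mirrors the body shape of the curl example.
    """
    if size > MAX_SUGGESTIONS:
        raise ValueError(f"size must not exceed {MAX_SUGGESTIONS}")
    return {
        "query": query,
        "types": {"documents": {"fields": list(fields)}},
        "size": size,
    }


body = build_suggestion_request("car", ["title", "states"], size=3)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the curl command shown earlier.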

{
  "results": {
    "documents": [
      {
        "suggestion": "carlsbad"
      },
      {
        "suggestion": "carlsbad caverns"
      },
      {
        "suggestion": "carolina"
      }
    ]
  },
  "meta": {
    "request_id": "914f909793379ed5af9379b4401f19be"
  }
}

Three suggestions appeared!
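A client would typically flatten that response into a plain list before showing it in a search box. The snippet below is a minimal sketch that assumes the response shape shown above; `extract_suggestions` is a hypothetical helper name.

```python
import json

# Response body copied from the example above.
response_text = """
{
  "results": {
    "documents": [
      {"suggestion": "carlsbad"},
      {"suggestion": "carlsbad caverns"},
      {"suggestion": "carolina"}
    ]
  },
  "meta": {
    "request_id": "914f909793379ed5af9379b4401f19be"
  }
}
"""


def extract_suggestions(response):
    """Return the suggestion strings in the order the API returned them."""
    return [item["suggestion"] for item in response["results"]["documents"]]


suggestions = extract_suggestions(json.loads(response_text))
print(suggestions)
```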

But what does this mean? What happened?

A Query Suggestion is just a search query, but with a refined interface and server-side logical optimizations.

In other words, App Search will tidy up the data and return it in a way that’s desirable for many search use cases.

Now let’s compare it to a regular search query, with similar parameters:

curl -X POST 'https://[instance id].ent-search.[region].[provider].cloud.es.io/api/as/v1/engines/national-parks-demo/search' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer search-soaewu2ye6uc45dr8mcd54v8' \
-d '{
  "search_fields": {
    "title": {},
    "states": {}
  },
  "result_fields": {
    "title": {
      "raw": {
        "size": 50
      }
    },
    "states": {
      "raw": {
        "size": 50
      }
    }
  },
  "page": {
    "size": 3,
    "current": 1
  },
  "query": "car"
}'

We added search_fields to limit the searched fields to title and states.

And we added result_fields to limit the returned fields to title and states.

Then we added page to limit the response to 3 results, and of course the query.
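To make the mapping explicit, here is a sketch that derives the search body from the same three inputs a query suggestion takes. The function name `build_search_request` is hypothetical, and the `raw_size` default of 50 is simply the value used in the example above.

```python
def build_search_request(query, fields, size, raw_size=50):
    """Build a search body that mirrors a query suggestion request.

    Hypothetical helper: the same fields are both searched and returned.
    """
    return {
        # Restrict which fields are searched.
        "search_fields": {field: {} for field in fields},
        # Restrict which fields come back, as raw values.
        "result_fields": {field: {"raw": {"size": raw_size}} for field in fields},
        # Cap the number of results, like "size" in a suggestion request.
        "page": {"size": size, "current": 1},
        "query": query,
    }


search_body = build_search_request("car", ["title", "states"], size=3)
print(search_body)
```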

The results return more of the underlying document, as one might expect:

{
  "meta": {
    "alerts": [],
    "warnings": [],
    "page": {
      "current": 1,
      "total_pages": 1,
      "total_results": 3,
      "size": 3
    },
    "request_id": "ac054e74e3a1857f08f4ba07ddc10160"
  },
  "results": [
    {
      "title": {
        "raw": "Congaree"
      },
      "states": {
        "raw": [
          "South Carolina"
        ]
      },
      "id": {
        "raw": "park_congaree"
      },
      "_meta": {
        "score": 0.29399884
      }
    },
    {
      "title": {
        "raw": "Carlsbad Caverns"
      },
      "states": {
        "raw": [
          "New Mexico"
        ]
      },
      "id": {
        "raw": "park_carlsbad-caverns"
      },
      "_meta": {
        "score": 0.2734595
      }
    },
    {
      "title": {
        "raw": "Great Smoky Mountains"
      },
      "states": {
        "raw": [
          "Tennessee",
          "North Carolina"
        ]
      },
      "id": {
        "raw": "park_great-smoky-mountains"
      },
      "_meta": {
        "score": 0.24268812
      }
    }
  ]
}

By looking at this search response, we can see the basis for the query suggestion response, which returned:

  1. "Carlsbad"
  2. "Carlsbad Caverns"
  3. "Carolina"

It found "Carlsbad" and "Carlsbad Caverns" as a solid title match and "Carolina" as a great states match.
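One way to see where those matches come from is to scan the search results for words that begin with the partial query. This is only an illustration over a simplified copy of the results above (the raw wrappers are removed); it is not how App Search computes suggestions, and it does not reproduce phrase suggestions such as "carlsbad caverns". The name `matching_words` is hypothetical.

```python
# Simplified copy of the search results shown above.
search_results = [
    {"title": "Congaree", "states": ["South Carolina"]},
    {"title": "Carlsbad Caverns", "states": ["New Mexico"]},
    {"title": "Great Smoky Mountains", "states": ["Tennessee", "North Carolina"]},
]


def matching_words(results, prefix):
    """Map each field name to the lowercased words that start with prefix."""
    found = {}
    for doc in results:
        for field, value in doc.items():
            # Treat single values and arrays the same way.
            values = value if isinstance(value, list) else [value]
            for text in values:
                for word in text.lower().split():
                    if word.startswith(prefix):
                        words = found.setdefault(field, [])
                        if word not in words:
                            words.append(word)
    return found


matches = matching_words(search_results, "car")
print(matches)
```

The title field contributes "carlsbad" and the states field contributes "carolina", which lines up with the match described above.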

But why were "Carlsbad" and "Carlsbad Caverns" first? Why do the suggestions appear in a different order than the search results?

App-lastic Search

App Search is built on Elasticsearch.

Each App Search query is converted into a refined Elasticsearch query.

Underneath, both App Search query suggestions and search queries are multi_match Elasticsearch queries.

The type of multi_match query differs between a query suggestion and search query.

Query suggestions use the best_fields type.

The best_fields type is useful when searching for multiple words in the same field.

Search uses the cross_fields type.

The cross_fields type is useful when looking across multiple fields.

Read the Elasticsearch documentation to explore multi_match in greater depth.
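For orientation, the two multi_match types look like this in Elasticsearch query DSL. This is a bare sketch: the queries App Search actually generates are more refined and are not shown in this guide, and the `multi_match` helper below is hypothetical.

```python
def multi_match(query, fields, match_type):
    """Return a minimal Elasticsearch multi_match query body."""
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": list(fields),
                "type": match_type,
            }
        }
    }


# Roughly the flavor of query behind a query suggestion.
suggestion_like = multi_match("car", ["title", "states"], "best_fields")

# Roughly the flavor of query behind a search.
search_like = multi_match("car", ["title", "states"], "cross_fields")

print(suggestion_like)
print(search_like)
```

The only difference between the two bodies is the type value, which changes how Elasticsearch combines matches across the listed fields.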

There are differences other than type, too: a search query will apply stemming and prefix matching, for example.

A query suggestion will match deeper on a single field and apply less search logic when compared to a search query.

This leads to faster, more concise, purpose-built queries which return a suggestion.

The goal of a suggestion is to find a match, then suggest that match to a user who will use it for their own query.

It’s like "search before search".

Search, on the other hand, gets "logically deep" into establishing relevance across all available fields.