Filtered Query

The filtered query is used to combine another query with any filter. Filters are usually faster than queries because:

  • they don’t have to calculate the relevance _score for each document — the answer is just a boolean “Yes, the document matches the filter” or “No, the document does not match the filter”.
  • the results from most filters can be cached in memory, making subsequent executions faster.

Exclude as many documents as you can with a filter, then query just the documents that remain.

{
  "filtered": {
    "query": {
      "match": { "tweet": "full text search" }
    },
    "filter": {
      "range": { "created": { "gte": "now-1d/d" }}
    }
  }
}

The filtered query can be used wherever a query is expected. For instance, to use the filtered query above in a search request:

curl -XGET localhost:9200/_search -d '
{
  "query": {
    "filtered": {
      "query": {
        "match": { "tweet": "full text search" }
      },
      "filter": {
        "range": { "created": { "gte": "now-1d/d" }}
      }
    }
  }
}
'

The filtered query is passed as the value of the query parameter in the search request.
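
For readers scripting this from Python, the same request can be assembled with the standard library. This is a minimal sketch rather than an official client: the helper name build_search_request is hypothetical, and the host and port are the ones assumed by the curl example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(filtered, host="http://localhost:9200"):
    """Wrap a `filtered` clause in a search body and build the HTTP request.

    Hypothetical helper: it mirrors the curl call and does not send anything.
    """
    # The filtered query is the value of the top-level "query" parameter.
    body = {"query": {"filtered": filtered}}
    return urllib.request.Request(
        host + "/_search",
        data=json.dumps(body).encode("utf-8"),
        method="GET",  # same verb as the curl example (GET with a body)
    )


request = build_search_request({
    "query": {"match": {"tweet": "full text search"}},
    "filter": {"range": {"created": {"gte": "now-1d/d"}}},
})
# Passing `request` to urllib.request.urlopen would send it to a running node.
```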

Filtering without a query

If a query is not specified, it defaults to the match_all query. This means that the filtered query can be used to wrap just a filter, so that it can be used wherever a query is expected.

curl -XGET localhost:9200/_search -d '
{
  "query": {
    "filtered": {
      "filter": {
        "range": { "created": { "gte": "now-1d/d" }}
      }
    }
  }
}
'

No query has been specified, so this request applies just the filter, returning all documents created since the start of yesterday (the date math rounds down to the beginning of the day).
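
To make the default visible, the sketch below spells out the match_all query that a filter-only request leaves implicit. The helper name with_default_query is hypothetical and only illustrates the defaulting rule described in this section; Elasticsearch applies the default itself, so there is no need to do this in client code.

```python
def with_default_query(filtered):
    """Return a copy of a `filtered` clause with the implicit query spelled out."""
    explicit = dict(filtered)
    # A missing query is treated as match_all, so every document is a candidate.
    explicit.setdefault("query", {"match_all": {}})
    return explicit


filter_only = {"filter": {"range": {"created": {"gte": "now-1d/d"}}}}
explicit = with_default_query(filter_only)
```

Under that rule, both filter_only and explicit describe the same search: every document is a candidate, and the filter alone decides which ones are returned.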

Multiple filters

Multiple filters can be applied by wrapping them in a bool filter, for example:

{
  "filtered": {
    "query": { "match": { "tweet": "full text search" }},
    "filter": {
      "bool": {
        "must": { "range": { "created": { "gte": "now-1d/d" }}},
        "should": [
          { "term": { "featured": true }},
          { "term": { "starred":  true }}
        ],
        "must_not": { "term": { "deleted": true }}
      }
    }
  }
}

Similarly, multiple queries can be combined with a bool query.
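
As a sketch of that last point, the body below combines two match queries in a bool query and uses the result as the query part of a filtered query. The author field and its value are hypothetical, chosen only for illustration. The body is written as a Python dictionary so that it can be serialized with json.dumps.

```python
import json

combined = {
    "filtered": {
        "query": {
            "bool": {
                # This clause has to match for a document to be returned.
                "must": {"match": {"tweet": "full text search"}},
                # Optional clause: raises the score when it matches.
                "should": {"match": {"author": "john"}},  # hypothetical field
            }
        },
        "filter": {"range": {"created": {"gte": "now-1d/d"}}},
    }
}

print(json.dumps(combined, indent=2))
```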

Filter strategy

You can control how the filter and query are executed with the strategy parameter:

{
    "filtered" : {
        "query" :   { ... },
        "filter" :  { ... },
        "strategy": "leap_frog"
    }
}

This is an expert-level setting. Most users can simply ignore it.

The strategy parameter accepts the following options:

leap_frog_query_first

Look for the first document matching the query, and then alternately advance the query and the filter to find common matches.

leap_frog_filter_first

Look for the first document matching the filter, and then alternately advance the query and the filter to find common matches.

leap_frog

Same as leap_frog_query_first.

query_first

If the filter supports random access, then search for documents using the query, and then consult the filter to check whether there is a match. Otherwise fall back to leap_frog_query_first.

random_access_${threshold}

If the filter supports random access and matches at least one document among the first ${threshold} documents, then apply the filter first. Otherwise fall back to leap_frog_query_first. ${threshold} must be greater than or equal to 1.

random_access_always

Apply the filter first if it supports random access. Otherwise fall back to leap_frog_query_first.

The default strategy is query_first for filters that are not advanceable, such as geo filters and script filters, and random_access_100 for all other filters.
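
The difference between the leap-frog and random-access styles can be sketched over plain sorted lists of document IDs. This is a simplified illustration only, not the actual implementation: the function names are hypothetical, the lists stand in for the internal iterators, and a Python set stands in for a filter that supports random access.

```python
def leap_frog(query_docs, filter_docs):
    """Intersect two sorted doc-ID lists by alternately advancing each side."""
    matches = []
    qi = fi = 0
    while qi < len(query_docs) and fi < len(filter_docs):
        q, f = query_docs[qi], filter_docs[fi]
        if q == f:
            matches.append(q)  # both sides agree: a common match
            qi += 1
            fi += 1
        elif q < f:
            qi += 1  # query side is behind, advance it
        else:
            fi += 1  # filter side is behind, advance it
    return matches


def query_first(query_docs, filter_matches):
    """Iterate the query hits and consult a random-access filter for each one."""
    return [doc for doc in query_docs if filter_matches(doc)]


query_docs = [2, 5, 8, 13, 21]
filter_docs = [1, 5, 6, 13, 34]
filter_set = set(filter_docs)  # stands in for a random-access filter

print(leap_frog(query_docs, filter_docs))                # [5, 13]
print(query_first(query_docs, filter_set.__contains__))  # [5, 13]
```

Both approaches return the same documents. They differ only in how much work is spent advancing the filter versus checking individual documents against it, which is the trade-off the strategy parameter controls.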