Validate API

Validates a potentially expensive query without executing it.

GET twitter/_validate/query?q=user:foo

Request

GET /<index>/_validate/<query>

Description

The validate API allows you to validate a potentially expensive query without executing it. The query can be sent either as the q query string parameter or in the request body.

Path parameters

<index>
(Optional, string) Comma-separated list or wildcard expression of index names used to limit the request.
query
(Optional, query object) Defines the search definition using the Query DSL.

Query parameters

all_shards
(Optional, boolean) If true, the validation is executed on all shards instead of one random shard per index. Defaults to false.
allow_no_indices

(Optional, boolean) If true, the request does not return an error if a wildcard expression or _all value retrieves only missing or closed indices.

This parameter also applies to index aliases that point to a missing or closed index.

Defaults to false.

analyzer
(Optional, string) Analyzer to use for the query string.
analyze_wildcard
(Optional, boolean) If true, wildcard and prefix queries are analyzed. Defaults to false.
default_operator
(Optional, string) The default operator for query string query: AND or OR. Defaults to OR.
df
(Optional, string) Field to use as default where no field prefix is given in the query string.
expand_wildcards

(Optional, string) Controls which kinds of indices wildcard expressions can expand to. Valid values are:

all
Expand to open and closed indices.
open
Expand only to open indices.
closed
Expand only to closed indices.
none
Wildcard expressions are not accepted.
explain
(Optional, boolean) If true, the response returns detailed information if an error has occurred. Defaults to false.
ignore_unavailable
(Optional, boolean) If true, missing or closed indices are not included in the response. Defaults to false.
lenient
(Optional, boolean) If true, format-based query failures (such as providing text to a numeric field) will be ignored. Defaults to false.
rewrite
(Optional, boolean) If true, returns a more detailed explanation showing the actual Lucene query that will be executed. Defaults to false.
q
(Optional, string) Query in the Lucene query string syntax.
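
The parameters above are passed in the URL query string. As a minimal sketch of how a client might assemble such a URL, the following uses only the Python standard library. The helper name build_validate_url and the address http://localhost:9200 are assumptions made for illustration and are not part of the API.

```python
from urllib.parse import quote, urlencode


def build_validate_url(base_url, index=None, **params):
    """Return the _validate/query URL for an optional index and query parameters."""
    # Keep commas and wildcards readable in the index expression.
    path = f"/{quote(index, safe=',*')}" if index else ""
    # Serialize Python booleans as lowercase true/false.
    encoded = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    url = f"{base_url.rstrip('/')}{path}/_validate/query"
    return f"{url}?{encoded}" if encoded else url


# Assumed local node address; adjust for your cluster.
print(build_validate_url("http://localhost:9200", "twitter", q="user:foo", explain=True))
```

The colon in user:foo is percent-encoded by urlencode, which is the form an HTTP client would send on the wire.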

Examples

PUT twitter/_bulk?refresh
{"index":{"_id":1}}
{"user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elasticsearch"}
{"index":{"_id":2}}
{"user" : "kimchi", "post_date" : "2009-11-15T14:12:13", "message" : "My username is similar to @kimchy!"}

When sent a valid query:

GET twitter/_validate/query?q=user:foo

The response contains valid:true:

{"valid":true,"_shards":{"total":1,"successful":1,"failed":0}}

The query may also be sent in the request body:

GET twitter/_validate/query
{
  "query" : {
    "bool" : {
      "must" : {
        "query_string" : {
          "query" : "*:*"
        }
      },
      "filter" : {
        "term" : { "user" : "kimchy" }
      }
    }
  }
}

A query sent in the request body must be nested in a query key, as in the search API.
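
A client that holds a bare Query DSL clause needs to wrap it before sending. The sketch below shows one way to do that in Python; validate_body is a hypothetical helper name, and sending the body is left to whichever HTTP client you use.

```python
import json


def validate_body(query):
    """Wrap a Query DSL clause in the top-level "query" key."""
    return json.dumps({"query": query})


# The term clause from the example above, wrapped for the request body.
body = validate_body({"term": {"user": "kimchy"}})
print(body)
```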

If the query is invalid, valid will be false. Here the query is invalid because Elasticsearch knows the post_date field should be a date due to dynamic mapping, and foo does not correctly parse into a date:

GET twitter/_validate/query
{
  "query": {
    "query_string": {
      "query": "post_date:foo",
      "lenient": false
    }
  }
}

The API returns the following response:

{"valid":false,"_shards":{"total":1,"successful":1,"failed":0}}
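
A caller typically only needs a yes or no answer from this response. The following sketch reads the valid flag and, as a design choice of this example rather than a rule of the API, also treats any reported shard failure as a negative result. The function name is_valid is hypothetical.

```python
import json


def is_valid(response_text):
    """Return True only if the query validated and no shard reported a failure."""
    response = json.loads(response_text)
    shards = response.get("_shards", {})
    # Treating shard failures as "not valid" is an assumption of this sketch.
    return bool(response.get("valid")) and shards.get("failed", 0) == 0


print(is_valid('{"valid":false,"_shards":{"total":1,"successful":1,"failed":0}}'))
```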

The explain parameter

An explain parameter can be specified to get more detailed information about why a query failed:

GET twitter/_validate/query?explain=true
{
  "query": {
    "query_string": {
      "query": "post_date:foo",
      "lenient": false
    }
  }
}

The API returns the following response:

{
  "valid" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "explanations" : [ {
    "index" : "twitter",
    "valid" : false,
    "error" : "twitter/IAEc2nIXSSunQA_suI0MLw] QueryShardException[failed to create query:...failed to parse date field [foo]"
  } ]
}
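
To surface these messages in a client, one option is to collect the error of every explanation marked invalid. The sketch below assumes the response has already been parsed into a Python dict; explain_errors is a hypothetical helper, and the error string in the sample is abbreviated from the response above.

```python
def explain_errors(response):
    """Map each index name to its error message for explanations marked invalid."""
    return {
        item["index"]: item.get("error", "")
        for item in response.get("explanations", [])
        if not item.get("valid", False)
    }


# Abbreviated sample shaped like the explain=true response above.
sample = {
    "valid": False,
    "explanations": [
        {"index": "twitter", "valid": False, "error": "failed to parse date field [foo]"}
    ],
}
print(explain_errors(sample))
```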

The rewrite parameter

When the query is valid, the explanation defaults to the string representation of that query. With rewrite set to true, the explanation is more detailed and shows the actual Lucene query that will be executed.

GET twitter/_validate/query?rewrite=true
{
  "query": {
    "more_like_this": {
      "like": {
        "_id": "2"
      },
      "boost_terms": 1
    }
  }
}

The API returns the following response:

{
   "valid": true,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "explanations": [
      {
         "index": "twitter",
         "valid": true,
         "explanation": "((user:terminator^3.71334 plot:future^2.763601 plot:human^2.8415773 plot:sarah^3.4193945 plot:kyle^3.8244398 plot:cyborg^3.9177752 plot:connor^4.040236 plot:reese^4.7133346 ... )~6) -ConstantScore(_id:2)) #(ConstantScore(_type:_doc))^0.0"
      }
   ]
}

Rewrite and all_shards parameters

By default, the request is executed on a single, randomly selected shard. The detailed explanation of the query may depend on which shard is hit, and may therefore vary from one request to another. When rewriting a query, use the all_shards parameter to get a response from all available shards.

GET twitter/_validate/query?rewrite=true&all_shards=true
{
  "query": {
    "match": {
      "user": {
        "query": "kimchy",
        "fuzziness": "auto"
      }
    }
  }
}

The API returns the following response:

{
  "valid": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "explanations": [
    {
      "index": "twitter",
      "shard": 0,
      "valid": true,
      "explanation": "(user:kimchi)^0.8333333 user:kimchy"
    }
  ]
}
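
With all_shards set to true, each entry in explanations carries a shard number, so a client may want to group the rewritten queries per index and shard to compare them. A minimal sketch, assuming the response is already parsed into a Python dict; explanations_by_shard is a hypothetical helper name.

```python
from collections import defaultdict


def explanations_by_shard(response):
    """Group rewritten explanations by index, keyed by shard number."""
    grouped = defaultdict(dict)
    for item in response.get("explanations", []):
        grouped[item["index"]][item.get("shard")] = item.get("explanation")
    return dict(grouped)


# Sample taken from the response above.
sample_response = {
    "valid": True,
    "explanations": [
        {
            "index": "twitter",
            "shard": 0,
            "valid": True,
            "explanation": "(user:kimchi)^0.8333333 user:kimchy",
        }
    ],
}
print(explanations_by_shard(sample_response))
```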