Querying Elasticsearch

By default, Vega’s data element can use embedded and external data with a "url" parameter. Kibana adds support for direct Elasticsearch queries by overloading the "url" value.

Here is an example of an Elasticsearch query that counts the number of documents in all indexes. The query uses the @timestamp field to filter the time range and breaks it into histogram buckets.

// An object instead of a string for the url value
// is treated as a context-aware Elasticsearch query.
url: {
  // Specify the time filter (upper right corner) with this field
  %timefield%: @timestamp
  // Apply dashboard context filters when set
  %context%: true

  // Which indexes to search
  index: _all
  // The body element may contain "aggs" and "query" subfields
  body: {
    aggs: {
      time_buckets: {
        date_histogram: {
          // Use date histogram aggregation on @timestamp field
          field: @timestamp
          // The interval value depends on the date range picker
          // Use an integer instead of true to set an approximate bucket count
          interval: { %autointerval%: true }
          // Make sure we get an entire range, even if it has no data
          extended_bounds: {
            min: { %timefilter%: "min" }
            max: { %timefilter%: "max" }
          }
          // Use this for linear (e.g. line, area) graphs
          // Without it, empty buckets will not show up
          min_doc_count: 0
        }
      }
    }
    // Speed up the response by only including aggregation results
    size: 0
  }
}

The full result has this kind of structure:

{
  "aggregations": {
    "time_buckets": {
      "buckets": [{
          "key_as_string": "2015-11-30T22:00:00.000Z",
          "key": 1448920800000,
          "doc_count": 28
        }, {
          "key_as_string": "2015-11-30T23:00:00.000Z",
          "key": 1448924400000,
          "doc_count": 330
        }, ...

Note that "key" is a Unix timestamp in milliseconds, and can be used without conversion by the Vega date expressions.

For most graphs we only need the list of bucket values, so we use the format: {property: "aggregations.time_buckets.buckets"} expression to focus on just the data we need.
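
As a rough sketch of how these pieces fit together, a Vega-Lite spec might pair the query with the format expression and plot the buckets directly. The schema version, the mark type, and the encoding shown here are illustrative assumptions rather than part of the example above:

{
  // Schema version is an assumption; use the one your Kibana version supports
  $schema: "https://vega.github.io/schema/vega-lite/v2.json"
  data: {
    url: {
      %timefield%: @timestamp
      %context%: true
      index: _all
      body: {
        aggs: {
          time_buckets: {
            date_histogram: {
              field: @timestamp
              interval: { %autointerval%: true }
              extended_bounds: {
                min: { %timefilter%: "min" }
                max: { %timefilter%: "max" }
              }
              min_doc_count: 0
            }
          }
        }
        size: 0
      }
    }
    // Keep only the bucket list from the full response
    format: { property: "aggregations.time_buckets.buckets" }
  }
  // Assumed mark type; any mark that suits a time series would do
  mark: line
  encoding: {
    // "key" is a millisecond timestamp, so it can be read as temporal directly
    x: { field: "key", type: "temporal" }
    y: { field: "doc_count", type: "quantitative" }
  }
}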

A query may also be specified with an explicit time range and dashboard context. This query is equivalent to "%context%": true, "%timefield%": "@timestamp", except that the time range is shifted back by 10 minutes:

{
  body: {
    query: {
      bool: {
        must: [
          // This string will be replaced
          // with the auto-generated "MUST" clause
          "%dashboard_context-must_clause%"
          {
            range: {
              // apply timefilter (upper right corner)
              // to the @timestamp variable
              @timestamp: {
                // "%timefilter%" will be replaced with
                // the current values of the time filter
                // (from the upper right corner)
                "%timefilter%": true
                // The options below only work with %timefilter%
                // Shift the current time filter back by 10 units
                shift: 10
                // week, day (default), hour, minute, second
                unit: minute
              }
            }
          }
        ]
        must_not: [
          // This string will be replaced with
          // the auto-generated "MUST-NOT" clause
          "%dashboard_context-must_not_clause%"
        ]
      }
    }
  }
}

The "%timefilter%" can also be used to specify a single min or max value. For example, the date_histogram’s extended_bounds takes two values, min and max. Instead of hardcoding a value, you may use "min": {"%timefilter%": "min"}, which will be replaced with the beginning of the current time range. The shift and unit values are also supported here. The "interval" can also be set dynamically, depending on the currently selected range: "interval": {"%autointerval%": 10} will try to produce about 10-15 data points (buckets).
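
For illustration, the two placeholders might be combined in a single date_histogram as follows. The 10-minute shift on both bounds is an assumed value, chosen only to mirror the shifted query shown earlier:

date_histogram: {
  field: @timestamp
  // Aim for roughly 10-15 buckets across the selected range
  interval: { %autointerval%: 10 }
  extended_bounds: {
    // Start and end of the current time range, each shifted back 10 minutes
    min: { %timefilter%: "min", shift: 10, unit: "minute" }
    max: { %timefilter%: "max", shift: 10, unit: "minute" }
  }
  // Keep empty buckets so line and area graphs stay continuous
  min_doc_count: 0
}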