Rate aggregation

A rate metrics aggregation can be used only inside a date_histogram aggregation. It calculates a rate of documents, or of the values of a field, in each date_histogram bucket.

Syntax

A rate aggregation looks like this in isolation:

{
  "rate": {
    "unit": "month",
    "field": "requests"
  }
}

The following request groups all sales records into monthly buckets and then converts the number of sales transactions in each bucket into an annual sales rate.

GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"  
      },
      "aggs": {
        "my_rate": {
          "rate": {
            "unit": "year"  
          }
        }
      }
    }
  }
}

The histogram is grouped by month (calendar_interval).

The rate, however, is converted into an annual rate (unit).

The response returns the annual rate of transactions in each bucket. Since there are 12 months per year, the annual rate is calculated automatically by multiplying the monthly rate by 12.

{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "my_rate" : {
            "value" : 36.0
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "my_rate" : {
            "value" : 24.0
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "my_rate" : {
            "value" : 24.0
          }
        }
      ]
    }
  }
}
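The values in this response can be checked by hand. The following is a minimal sketch of the arithmetic only, not of how Elasticsearch computes the rate internally; the helper name monthly_count_to_rate and the MONTHS_PER_UNIT table are hypothetical and exist only for this illustration.

```python
# Number of months in each calendar unit (illustrative lookup table).
MONTHS_PER_UNIT = {"month": 1, "quarter": 3, "year": 12}


def monthly_count_to_rate(doc_count, unit):
    """Scale a count taken from a one-month bucket to the requested unit."""
    return float(doc_count * MONTHS_PER_UNIT[unit])


# doc_count values of the three monthly buckets in the response above.
doc_counts = [3, 2, 2]

annual_rates = [monthly_count_to_rate(count, "year") for count in doc_counts]
print(annual_rates)  # [36.0, 24.0, 24.0]
```

Each monthly doc_count is multiplied by 12, which matches the my_rate values 36.0, 24.0 and 24.0 in the response.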

Instead of counting the number of documents, it is also possible to calculate the sum of the values of a field across the documents in each bucket. The following request groups all sales records into monthly buckets, calculates the total monthly sales, and converts them into average daily sales.

GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"  
      },
      "aggs": {
        "avg_price": {
          "rate": {
            "field": "price", 
            "unit": "day"  
          }
        }
      }
    }
  }
}

The histogram is grouped by month (calendar_interval).

The sum of all sale prices is calculated (field).

The sum is converted into average daily sales (unit).

The response will contain the average daily sale prices for each month.

{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "avg_price" : {
            "value" : 17.741935483870968
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 2.142857142857143
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 12.096774193548388
          }
        }
      ]
    }
  }
}
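The daily values above depend on the number of days in each month, which is why January and March divide by 31 and February 2015 by 28. The sketch below reproduces that arithmetic. The monthly totals 550, 60 and 375 are an assumption: they are not stated in the text and were inferred from the response values, and the helper daily_rate is hypothetical.

```python
import calendar

# Assumed monthly sums of "price" (inferred from the response, not given in the text).
monthly_totals = {(2015, 1): 550.0, (2015, 2): 60.0, (2015, 3): 375.0}


def daily_rate(total, year, month):
    """Divide a monthly total by the number of days in that calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return total / days_in_month


daily_rates = {
    key: daily_rate(total, *key) for key, total in monthly_totals.items()
}
for (year, month), rate in daily_rates.items():
    print(f"{year}-{month:02d}: {rate}")
```

Under that assumption the results agree with the avg_price values in the response up to floating-point rounding.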

Relationship between bucket sizes and rate

The rate aggregation supports all units that can be used in the calendar_interval parameter of the date_histogram aggregation. The specified unit should be compatible with the date_histogram aggregation interval, i.e. it should be possible to convert the bucket size into the unit. By default, the interval of the date_histogram is used.

"unit": "second"
compatible with all intervals
"unit": "minute"
compatible with all intervals
"unit": "hour"
compatible with all intervals
"unit": "day"
compatible with all intervals
"unit": "week"
compatible with all intervals
"unit": "month"
compatible only with month, quarter and year calendar intervals
"unit": "quarter"
compatible only with month, quarter and year calendar intervals
"unit": "year"
compatible only with month, quarter and year calendar intervals
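The compatibility rules in the list above can be expressed as a small lookup. This is only a sketch of the rules as written here, not Elasticsearch's own validation code; the function is_compatible and the two set names are hypothetical.

```python
# Units the list above describes as compatible with every interval.
ALWAYS_COMPATIBLE = {"second", "minute", "hour", "day", "week"}

# Units that only work with month, quarter and year calendar intervals.
CALENDAR_ONLY = {"month", "quarter", "year"}


def is_compatible(unit, calendar_interval):
    """Return True if the rate unit can be used with the histogram interval."""
    if unit in ALWAYS_COMPATIBLE:
        return True
    if unit in CALENDAR_ONLY:
        return calendar_interval in CALENDAR_ONLY
    raise ValueError(f"unknown rate unit: {unit!r}")


print(is_compatible("day", "month"))   # True
print(is_compatible("month", "week"))  # False
print(is_compatible("year", "month"))  # True
```

A monthly rate cannot be derived from weekly buckets because a month is not a whole number of weeks, whereas a day, hour or second divides evenly into any bucket.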

Script

The rate aggregation also supports scripting. For example, if we need to adjust our prices before calculating rates, we could use a script to recalculate them on the fly:

GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "avg_price": {
          "rate": {
            "script": {  
              "lang": "painless",
              "source": "doc['price'].value * params.adjustment",
              "params": {
                "adjustment": 0.9  
              }
            }
          }
        }
      }
    }
  }
}

The field parameter is replaced with a script parameter, which uses the script to generate the values on which the rate is calculated.

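Scripting supports parameterized input just like any other script.

Because this request does not set a unit, the interval of the date_histogram is used, so the response contains the adjusted monthly totals: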

{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "avg_price" : {
            "value" : 495.0
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 54.0
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 337.5
          }
        }
      ]
    }
  }
}
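As a consistency check, dividing the scripted values by the adjustment factor should recover the unadjusted monthly totals. The sketch below does that arithmetic; the variable names are illustrative, and the comparison with the earlier daily-sales response assumes both examples ran against the same sales data, which the text does not state explicitly.

```python
adjustment = 0.9  # params.adjustment from the script above

# avg_price values from the scripted response above.
adjusted_totals = [495.0, 54.0, 337.5]

# Days in January, February and March 2015.
days_in_month = [31, 28, 31]

# Undo the adjustment to get the assumed unadjusted monthly totals.
unadjusted_totals = [value / adjustment for value in adjusted_totals]

# Convert those totals to a per-day figure for comparison.
unadjusted_daily = [
    total / days for total, days in zip(unadjusted_totals, days_in_month)
]

for total, daily in zip(unadjusted_totals, unadjusted_daily):
    print(round(total, 6), round(daily, 6))
```

The unadjusted totals come out at roughly 550, 60 and 375, and the per-day figures line up with the daily values returned when the rate was calculated on the price field with a unit of day.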