Percentiles Bucket Aggregation
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
A sibling pipeline aggregation that calculates percentiles of a specified metric across all buckets of a sibling aggregation. The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.
Syntax
A percentiles_bucket aggregation looks like this in isolation:
{ "percentiles_bucket": { "buckets_path": "the_sum" } }
Table 8. percentiles_bucket Parameters
Parameter Name | Description | Required | Default Value |
---|---|---|---|
buckets_path | The path to the buckets we wish to calculate percentiles for (see buckets_path Syntax for more details) | Required | |
gap_policy | The policy to apply when gaps are found in the data (see Dealing with gaps in the data for more details) | Optional | skip |
format | Format to apply to the output value of this aggregation | Optional | null |
percents | The list of percentiles to calculate | Optional | [ 1, 5, 25, 50, 75, 95, 99 ] |
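As a rough illustration of how these parameters fit together, the following Python sketch builds a percentiles_bucket clause as a plain dictionary. The helper name `percentiles_bucket_clause` is hypothetical and is not part of any Elasticsearch client. Optional parameters are left out when not supplied, on the assumption that the server then applies its own defaults.

```python
import json


def percentiles_bucket_clause(buckets_path, percents=None, gap_policy=None, fmt=None):
    """Build a percentiles_bucket clause (hypothetical helper, not a client API)."""
    body = {"buckets_path": buckets_path}  # the only required parameter
    if percents is not None:
        body["percents"] = list(percents)
    if gap_policy is not None:
        body["gap_policy"] = gap_policy
    if fmt is not None:
        body["format"] = fmt
    return {"percentiles_bucket": body}


# Same clause as in the monthly sales example on this page.
clause = percentiles_bucket_clause("sales_per_month>sales", percents=[25.0, 50.0, 75.0])
print(json.dumps(clause, indent=2))
```

The resulting dictionary can be placed under any aggregation name next to the multi-bucket aggregation that buckets_path points at.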
The following snippet calculates the percentiles for the total monthly sales buckets:
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "percentiles_monthly_sales": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>sales",
        "percents": [ 25.0, 50.0, 75.0 ]
      }
    }
  }
}
And the following may be the response:
{
  "took": 11,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/01/01 00:00:00",
          "key": 1420070400000,
          "doc_count": 3,
          "sales": { "value": 550.0 }
        },
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2,
          "sales": { "value": 60.0 }
        },
        {
          "key_as_string": "2015/03/01 00:00:00",
          "key": 1425168000000,
          "doc_count": 2,
          "sales": { "value": 375.0 }
        }
      ]
    },
    "percentiles_monthly_sales": {
      "values": {
        "25.0": 375.0,
        "50.0": 375.0,
        "75.0": 550.0
      }
    }
  }
}
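To make the shape of the result concrete, here is a small Python sketch that reads the percentile values out of a response like the one above. The `response` dictionary is a trimmed, hand-written copy of the sample response, not the output of a real client call.

```python
# Trimmed copy of the sample response; only the part read below is kept.
response = {
    "aggregations": {
        "percentiles_monthly_sales": {
            "values": {"25.0": 375.0, "50.0": 375.0, "75.0": 550.0}
        }
    }
}

values = response["aggregations"]["percentiles_monthly_sales"]["values"]

# Keys arrive as strings, so convert them to floats before sorting or lookup.
percentiles = {float(percent): value for percent, value in values.items()}

for percent in sorted(percentiles):
    print(f"p{percent:g}: {percentiles[percent]}")
```

The percentile keys are JSON strings, which is why the sketch converts them before treating them as numbers.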
Percentiles_bucket implementation
The percentiles_bucket aggregation returns the nearest input data point that is not greater than the requested percentile; it does not interpolate between data points.
The percentiles are calculated exactly and are not an approximation (unlike the Percentiles Metric). This means
the implementation maintains an in-memory, sorted list of your data to compute the percentiles, before discarding the
data. You may run into memory pressure issues if you attempt to calculate percentiles over many millions of
data points in a single percentiles_bucket.
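The approach described here (sort all bucket values in memory, then pick an existing data point with no interpolation) can be sketched in Python as follows. This is an illustration, not the Elasticsearch source: the function name is hypothetical, and the index rule (nearest rank, rounding half up) is an assumption chosen because it reproduces the sample response on this page. The actual implementation may select ranks differently.

```python
import math


def percentiles_bucket(bucket_values, percents):
    """Pick exact percentiles from a sorted in-memory list, without interpolation."""
    data = sorted(bucket_values)  # the whole data set is held in memory
    if not data:
        # ASSUMPTION: report NaN when there are no buckets to draw from.
        return {p: float("nan") for p in percents}
    result = {}
    for p in percents:
        # ASSUMPTION: nearest-rank index, rounding half up.
        index = math.floor((p / 100.0) * (len(data) - 1) + 0.5)
        result[p] = data[index]  # always an existing data point
    return result


# Monthly sales totals taken from the sample response.
monthly_sales = [550.0, 60.0, 375.0]
print(percentiles_bucket(monthly_sales, [25.0, 50.0, 75.0]))
# prints {25.0: 375.0, 50.0: 375.0, 75.0: 550.0}
```

Because the full list is sorted before any percentile is read, memory use grows linearly with the number of buckets, which is the source of the memory pressure mentioned above.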