WARNING: Version 1.4 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Histogram Facets
Facets are deprecated and will be removed in a future release. You are encouraged to migrate to aggregations instead. The equivalent aggregation would be the histogram aggregation.
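As a rough migration sketch, the request body below is built with a histogram aggregation in place of the facet. The helper name facet_to_aggregation and the sub-aggregation name total are illustrative assumptions, not part of Elasticsearch. The sum sub-aggregation is included because a histogram aggregation bucket reports a document count only, whereas the facet also reports a total.

```python
import json


def facet_to_aggregation(name, field, interval, value_field=None):
    """Build a search body that uses a histogram aggregation.

    Hypothetical helper for illustration only; the names are assumptions.
    """
    histogram = {"histogram": {"field": field, "interval": interval}}
    if value_field is not None:
        # A sum sub-aggregation stands in for the facet's per-bucket total.
        histogram["aggs"] = {"total": {"sum": {"field": value_field}}}
    return {"query": {"match_all": {}}, "aggs": {name: histogram}}


body = facet_to_aggregation("histo1", "field_name", 100)
body_with_total = facet_to_aggregation(
    "histo1", "key_field_name", 100, value_field="value_field_name"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request in the same way as the facet examples on this page.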
The histogram facet works with numeric data by building a histogram across intervals of the field values. Each value is "rounded" into an interval (or placed in a bucket), and statistics are provided per interval/bucket (count and total). Here is a simple example:
{ "query" : { "match_all" : {} }, "facets" : { "histo1" : { "histogram" : { "field" : "field_name", "interval" : 100 } } } }
The above example will run a histogram facet on the field_name field, with an interval of 100 (so, for example, a value of 1055 will be placed within the 1000 bucket).
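The rounding described above can be sketched in a few lines of Python. This is a minimal illustration for non-negative values, not the server's implementation; the function name bucket_key is an assumption, and the handling of negative values is not covered by the text.

```python
def bucket_key(value, interval):
    """Round a non-negative value down to the start of its interval."""
    # Floor-divide to find the interval index, then scale back up.
    return int(value // interval) * interval


key_1055 = bucket_key(1055, 100)  # falls in the 1000 bucket
key_999 = bucket_key(999, 100)    # falls in the 900 bucket
key_1000 = bucket_key(1000, 100)  # a value on the boundary starts a bucket
print(key_1055, key_999, key_1000)
```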
The interval can also be provided as a time-based interval (using the time format). This mainly makes sense when working with date fields or fields that represent absolute milliseconds. Here is an example:
{ "query" : { "match_all" : {} }, "facets" : { "histo1" : { "histogram" : { "field" : "field_name", "time_interval" : "1.5h" } } } }
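To make the time-based interval concrete, the sketch below converts a string such as "1.5h" to milliseconds and applies the same rounding to a millisecond timestamp. The function name and the set of unit suffixes handled here are assumptions for illustration; the time format documentation is the authority on which units are accepted.

```python
# Assumed subset of unit suffixes, mapped to milliseconds.
UNIT_MILLIS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
}


def time_interval_to_millis(text):
    """Convert a string like '1.5h' to a whole number of milliseconds."""
    # Check "ms" before "s" and "m" so the two-letter suffix wins.
    for unit in ("ms", "s", "m", "h", "d"):
        if text.endswith(unit):
            number = float(text[: -len(unit)])
            return round(number * UNIT_MILLIS[unit])
    raise ValueError("unrecognized time interval: %r" % text)


interval_ms = time_interval_to_millis("1.5h")
# A timestamp 2 hours and 1 second after the epoch, in milliseconds.
timestamp_ms = 2 * 60 * 60 * 1000 + 1000
bucket_start = (timestamp_ms // interval_ms) * interval_ms
print(interval_ms, bucket_start)
```

With a 1.5 hour interval, every bucket key is a multiple of 5,400,000 milliseconds.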
Key and Value
The histogram facet allows a different key and value to be used. The key is used to place the hit/document within the appropriate bucket, and the value is used to compute statistical data (for example, total). Here is an example:
{ "query" : { "match_all" : {} }, "facets" : { "histo1" : { "histogram" : { "key_field" : "key_field_name", "value_field" : "value_field_name", "interval" : 100 } } } }
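The split between key and value can be illustrated with a small in-memory sketch: the key field chooses the bucket, and the value field feeds the per-bucket count and total. The sample documents and the function name histogram_facet are made up for illustration, and the rounding assumes non-negative keys.

```python
from collections import defaultdict


def histogram_facet(docs, key_field, value_field, interval):
    """Group docs by rounded key and accumulate count and total per bucket."""
    buckets = defaultdict(lambda: {"count": 0, "total": 0})
    for doc in docs:
        # The key decides which bucket the document lands in.
        key = int(doc[key_field] // interval) * interval
        buckets[key]["count"] += 1
        # The value contributes to the bucket statistics.
        buckets[key]["total"] += doc[value_field]
    return dict(sorted(buckets.items()))


docs = [
    {"key_field_name": 1055, "value_field_name": 10},
    {"key_field_name": 1001, "value_field_name": 5},
    {"key_field_name": 230, "value_field_name": 7},
]
result = histogram_facet(docs, "key_field_name", "value_field_name", 100)
print(result)
```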
Script Key and Value
Sometimes both the key and the value need some munging. Scripts can be used for this: the key script is applied before the key is rounded into a bucket, and the value script is applied when the statistical data is computed per bucket. Here is an example:
{ "query" : { "match_all" : {} }, "facets" : { "histo1" : { "histogram" : { "key_script" : "doc['date'].date.minuteOfHour", "value_script" : "doc['num1'].value" } } } }
In the above sample, we use a date type field called date to get the minute of hour from it, and the total will be computed based on another field, num1. Note that in this case no interval was provided, so the bucket will be based directly on the key_script (no rounding).
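As an illustration of the "no rounding" case, the sketch below uses the minute of the hour directly as the bucket key and totals num1 per bucket. It mimics the two scripts with plain Python over made-up documents; it is not how the scripts are evaluated on the server.

```python
from collections import defaultdict
from datetime import datetime


def minute_of_hour_histogram(docs):
    """Bucket docs by the minute of their date, with no interval rounding."""
    buckets = defaultdict(lambda: {"count": 0, "total": 0})
    for doc in docs:
        key = doc["date"].minute   # plays the role of key_script
        value = doc["num1"]        # plays the role of value_script
        buckets[key]["count"] += 1
        buckets[key]["total"] += value
    return dict(sorted(buckets.items()))


sample_docs = [
    {"date": datetime(2014, 1, 1, 10, 15), "num1": 2},
    {"date": datetime(2014, 1, 1, 11, 15), "num1": 3},
    {"date": datetime(2014, 1, 2, 9, 42), "num1": 4},
]
minute_buckets = minute_of_hour_histogram(sample_docs)
print(minute_buckets)
```

Documents from different hours and days share a bucket whenever their minute of the hour matches.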
Parameters can also be provided to the different scripts (preferable if the script is the same, with different values for a specific parameter, like "factor"):
{ "query" : { "match_all" : {} }, "facets" : { "histo1" : { "histogram" : { "key_script" : "doc['date'].date.minuteOfHour * factor1", "value_script" : "doc['num1'].value + factor2", "params" : { "factor1" : 2, "factor2" : 3 } } } } }
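The effect of the parameters in this request can be sketched as follows: the script logic stays fixed while factor1 and factor2 vary. The function name apply_scripts and the sample document are assumptions for illustration.

```python
from datetime import datetime


def apply_scripts(doc, params):
    """Evaluate the key and value expressions from the example for one doc."""
    key = doc["date"].minute * params["factor1"]
    value = doc["num1"] + params["factor2"]
    return key, value


doc = {"date": datetime(2014, 1, 1, 10, 15), "num1": 2}
# Same expressions, two different parameter sets.
first = apply_scripts(doc, {"factor1": 2, "factor2": 3})
second = apply_scripts(doc, {"factor1": 4, "factor2": 0})
print(first, second)
```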
Memory Considerations
In order to implement the histogram facet, the relevant field values are
loaded into memory from the index. This means that each shard should have
enough memory to contain them. Since dynamically introduced numeric types
default to long and double, one option to reduce the memory footprint is to
explicitly set the types for the relevant fields to short, integer, or
float when possible.
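For a back-of-the-envelope feel for the saving, the sketch below multiplies a value count by the primitive width of each type. This is a rough lower bound under the assumption that each loaded value costs its primitive width; the actual in-memory representation and its overhead may differ.

```python
# Primitive widths in bytes for the numeric types named above.
BYTES_PER_VALUE = {"short": 2, "integer": 4, "long": 8, "float": 4, "double": 8}


def raw_value_bytes(num_values, field_type):
    """Estimate the bytes needed for the raw values alone (no overhead)."""
    return num_values * BYTES_PER_VALUE[field_type]


as_long = raw_value_bytes(10_000_000, "long")
as_integer = raw_value_bytes(10_000_000, "integer")
print(as_long, as_integer, as_long - as_integer)
```

Under this assumption, mapping a field as integer instead of long halves the raw value storage.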