Doc counts and overlapping jobs
There is an issue with doc counts, related to the above grouping limitation. Imagine you have two Rollup jobs saving to the same index, where one job is a "subset" of another job.
For example, you might have jobs with these two groupings:
PUT _xpack/rollup/job/sensor-all
{
  "groups" : {
    "date_histogram": {
      "field": "timestamp",
      "interval": "1h",
      "delay": "7d"
    },
    "terms": {
      "fields": ["node"]
    }
  },
  "metrics": [
    {
      "field": "price",
      "metrics": ["avg"]
    }
  ]
  ...
}
and
PUT _xpack/rollup/job/sensor-building
{
  "groups" : {
    "date_histogram": {
      "field": "timestamp",
      "interval": "1h",
      "delay": "7d"
    },
    "terms": {
      "fields": ["node", "building"]
    }
  }
  ...
}
The first job, sensor-all, contains the groupings and metrics that apply to all data in the index. The second job rolls up a subset of the data (in different buildings), which also includes a building identifier. You did this because combining them would run into the limitation described in the previous section.
This mostly works, but can sometimes return incorrect doc_counts when you search. All metrics will be valid, however.
The issue arises from the composite agg limitation described before, combined with search-time optimization. Imagine you try to run the following aggregation:
"aggs" : {
  "nodes": {
    "terms": {
      "field": "node"
    }
  }
}
This aggregation could be serviced by either the sensor-all or the sensor-building job, since they both group on the node field. So the RollupSearch API will search both of them and merge the results. This will result in correct doc_counts and correct metrics. No problem here.
The issue arises from an aggregation that can only be serviced by sensor-building, like this one:
"aggs" : {
  "nodes": {
    "terms": {
      "field": "node"
    },
    "aggs": {
      "building": {
        "terms": {
          "field": "building"
        }
      }
    }
  }
}
Now we run into a problem. The RollupSearch API will correctly identify that only the sensor-building job has all the required components to answer the aggregation, and will search it exclusively. Unfortunately, due to the composite aggregation limitation, that job only rolled up documents that have both a "node" and a "building" field. This means that the doc_counts for the "nodes" aggregation will not include counts for any document that does not have both [node, building] fields.
- The doc_count for the "nodes" aggregation will be incorrect because it only contains counts for nodes that also have buildings
- The doc_count for the "buildings" aggregation will be correct
- Any metrics, on any level, will be correct
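The undercount described above can be reproduced with a small toy model. This is only an illustrative sketch, not how Rollup is implemented: the sample documents and the helper functions `rollup_doc_counts` and `node_counts` are hypothetical, and the only behaviour taken from the text is that a job skips any document missing one of its grouping fields.

```python
from collections import Counter

# Hypothetical raw documents; two of them have no "building" field.
docs = [
    {"node": "a", "building": "b1", "price": 10.0},
    {"node": "a", "building": "b2", "price": 20.0},
    {"node": "a", "price": 30.0},
    {"node": "b", "price": 40.0},
]


def rollup_doc_counts(docs, group_fields):
    """Count docs per grouping key, skipping docs that lack any grouping field."""
    counts = Counter()
    for doc in docs:
        if all(field in doc for field in group_fields):
            counts[tuple(doc[field] for field in group_fields)] += 1
    return counts


def node_counts(rolled):
    """Sum rolled-up doc counts per node (the first element of each key)."""
    totals = Counter()
    for key, count in rolled.items():
        totals[key[0]] += count
    return totals


# Toy stand-ins for the two jobs' rolled-up data.
sensor_all = rollup_doc_counts(docs, ["node"])
sensor_building = rollup_doc_counts(docs, ["node", "building"])

nodes_from_all = node_counts(sensor_all)            # a: 3, b: 1
nodes_from_building = node_counts(sensor_building)  # a: 2, and "b" is missing
```

In this toy data, answering the "nodes" aggregation from sensor-building alone reports 2 documents for node "a" instead of 3 and drops node "b" entirely, while the per-building counts stay correct.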
Workarounds
There are two main workarounds if you find yourself with a schema like the above.
Easiest and most robust method: use separate indices to store your rollups. The limitations arise because you have several document schemas co-habitating in a single index, which makes it difficult for rollups to correctly summarize. If you make several rollup jobs and store them in separate indices, these sorts of difficulties do not arise. It does, however, keep you from searching across several different rollup indices at the same time.
The other workaround is to include an "off-target" aggregation in the query, which pulls in the "superset" job and corrects the doc counts.
The RollupSearch API determines the best job to search for each "leaf node" in the aggregation tree. So if we include a metric agg on price, which was only defined in the sensor-all job, that will "pull in" the other job:
"aggs" : {
  "nodes": {
    "terms": {
      "field": "node"
    },
    "aggs": {
      "building": {
        "terms": {
          "field": "building"
        }
      },
      "avg_price": {
        "avg": { "field": "price" }
      }
    }
  }
}
Because only the sensor-all job had an avg on the price field, the RollupSearch API is forced to pull in that additional job for searching, and will merge/correct the doc_counts as appropriate. This sort of workaround applies to any additional aggregation (metric or bucketing), although it can be tedious to look through the jobs and determine the right one to add.
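To make the "leaf node" idea more concrete, here is a toy model of per-leaf job selection. It is a sketch under stated assumptions, not the actual RollupSearch algorithm: the capability table `JOB_CAPS`, the helpers `leaf_requirements` and `jobs_searched`, and the rule that a leaf needs its own aggregation plus every ancestor bucketing aggregation are all hypothetical simplifications. It also assumes, as the text states, that sensor-building has no avg on price.

```python
# Hypothetical capabilities per job, as (aggregation type, field) pairs.
JOB_CAPS = {
    "sensor-all": {("terms", "node"), ("avg", "price")},
    "sensor-building": {("terms", "node"), ("terms", "building")},
}


def leaf_requirements(aggs, inherited=frozenset()):
    """Yield (leaf name, required capabilities) for each leaf aggregation."""
    for name, body in aggs.items():
        sub_aggs = body.get("aggs", {})
        # The aggregation type is the one key that is not the nested "aggs".
        agg_type = next(key for key in body if key != "aggs")
        required = inherited | {(agg_type, body[agg_type]["field"])}
        if sub_aggs:
            yield from leaf_requirements(sub_aggs, required)
        else:
            yield name, required


def jobs_searched(aggs):
    """Return every job that can serve at least one leaf of the tree."""
    selected = set()
    for _, required in leaf_requirements(aggs):
        selected |= {job for job, caps in JOB_CAPS.items() if required <= caps}
    return selected


# The problematic query: its only leaf needs both node and building.
nested = {
    "nodes": {
        "terms": {"field": "node"},
        "aggs": {"building": {"terms": {"field": "building"}}},
    }
}

# The workaround: an extra avg_price leaf that only sensor-all can serve.
with_avg = {
    "nodes": {
        "terms": {"field": "node"},
        "aggs": {
            "building": {"terms": {"field": "building"}},
            "avg_price": {"avg": {"field": "price"}},
        },
    }
}

only_building = jobs_searched(nested)
both_jobs = jobs_searched(with_avg)
```

In this model the first query selects only sensor-building, while the second selects both jobs because the avg_price leaf can be served only by sensor-all.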
Status
We realize this is an onerous limitation, and that it somewhat breaks the rollup contract of "pick the fields to rollup, we do the rest". We are actively working to get the limitation in the composite agg fixed, along with the related issues in Rollup. The documentation will be updated when the fix is implemented.