Grouping limitations with heterogeneous indices

Rollup previously could not reliably handle indices with heterogeneous mappings (multiple, unrelated or non-overlapping mappings). The recommendation at the time was to configure a separate job per data "type". For example, you might configure a separate job for each Beats module that you had enabled (one for process, another for filesystem, etc.).

This recommendation was driven by internal implementation details that caused document counts to be potentially incorrect if a single "merged" job was used.

This limitation has since been alleviated. As of 6.4.0, the best practice is to combine all rollup configurations into a single job.

As an example, if your index has two types of documents:

{
  "timestamp": 1516729294000,
  "temperature": 200,
  "voltage": 5.2,
  "node": "a"
}

and

{
  "timestamp": 1516729294000,
  "price": 123,
  "title": "Foo"
}

the best practice is to combine them into a single rollup job which covers both of these document types, like this:

PUT _rollup/job/combined
{
    "index_pattern": "data-*",
    "rollup_index": "data_rollup",
    "cron": "*/30 * * * * ?",
    "page_size": 1000,
    "groups": {
        "date_histogram": {
            "field": "timestamp",
            "interval": "1h",
            "delay": "7d"
        },
        "terms": {
            "fields": ["node", "title"]
        }
    },
    "metrics": [
        {
            "field": "temperature",
            "metrics": ["min", "max", "sum"]
        },
        {
            "field": "price",
            "metrics": ["avg"]
        }
    ]
}
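
If you already maintain one job per data type, the combined body above can be derived mechanically. The following Python sketch is one possible way to do that, not part of the Rollup API: `merge_rollup_jobs` is a hypothetical helper, the two per-type bodies are assumed for illustration, and the sketch assumes every job uses identical `date_histogram` settings. It only builds the request body; sending it to `PUT _rollup/job/combined` is left to whatever client you use.

```python
import json


def merge_rollup_jobs(jobs, index_pattern, rollup_index, cron, page_size):
    """Merge per-type rollup job bodies into one body (hypothetical helper)."""
    if not jobs:
        raise ValueError("need at least one job config")

    # Assumption: all jobs share the same date_histogram settings.
    date_histogram = jobs[0]["groups"]["date_histogram"]
    term_fields = []   # union of terms fields, in first-seen order
    metrics = {}       # field name -> list of metric names, in first-seen order

    for job in jobs:
        if job["groups"]["date_histogram"] != date_histogram:
            raise ValueError("date_histogram settings differ between jobs")
        for field in job["groups"].get("terms", {}).get("fields", []):
            if field not in term_fields:
                term_fields.append(field)
        for metric in job.get("metrics", []):
            names = metrics.setdefault(metric["field"], [])
            for name in metric["metrics"]:
                if name not in names:
                    names.append(name)

    groups = {"date_histogram": date_histogram}
    if term_fields:
        groups["terms"] = {"fields": term_fields}

    return {
        "index_pattern": index_pattern,
        "rollup_index": rollup_index,
        "cron": cron,
        "page_size": page_size,
        "groups": groups,
        "metrics": [{"field": f, "metrics": m} for f, m in metrics.items()],
    }


# Shared time bucketing, copied from the combined job above.
histogram = {"field": "timestamp", "interval": "1h", "delay": "7d"}

# Assumed per-type job bodies, one for each document type in the example.
sensor_job = {
    "groups": {"date_histogram": histogram, "terms": {"fields": ["node"]}},
    "metrics": [{"field": "temperature", "metrics": ["min", "max", "sum"]}],
}
sales_job = {
    "groups": {"date_histogram": histogram, "terms": {"fields": ["title"]}},
    "metrics": [{"field": "price", "metrics": ["avg"]}],
}

combined = merge_rollup_jobs(
    [sensor_job, sales_job],
    index_pattern="data-*",
    rollup_index="data_rollup",
    cron="*/30 * * * * ?",
    page_size=1000,
)
print(json.dumps(combined, indent=4))
```

The helper refuses to merge jobs whose `date_histogram` settings differ, because a single job has only one `date_histogram` group and silently picking one would change how the other data type is bucketed.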