Profiling Aggregations
editProfiling Aggregations
editaggregations
Section
editThe aggregations
section contains detailed timing of the aggregation tree executed by a particular shard.
The overall structure of this aggregation tree will resemble your original Elasticsearch request. Let’s
execute the previous query again and look at the aggregation profile this time:
GET /twitter/_search { "profile": true, "query": { "term": { "user": { "value": "test" } } }, "aggs": { "my_scoped_agg": { "terms": { "field": "likes" } }, "my_global_agg": { "global": {}, "aggs": { "my_level_agg": { "terms": { "field": "likes" } } } } }, "post_filter": { "match": { "message": "some" } } }
This yields the following aggregation profile output:
{ "profile" : { "shards" : [ { "aggregations" : [ { "type" : "LongTermsAggregator", "description" : "my_scoped_agg", "time_in_nanos" : 195386, "breakdown" : { "reduce" : 0, "build_aggregation" : 81171, "build_aggregation_count" : 1, "initialize" : 22753, "initialize_count" : 1, "reduce_count" : 0, "collect" : 91456, "collect_count" : 4 } }, { "type" : "GlobalAggregator", "description" : "my_global_agg", "time_in_nanos" : 190430, "breakdown" : { "reduce" : 0, "build_aggregation" : 59990, "build_aggregation_count" : 1, "initialize" : 29619, "initialize_count" : 1, "reduce_count" : 0, "collect" : 100815, "collect_count" : 4 }, "children" : [ { "type" : "LongTermsAggregator", "description" : "my_level_agg", "time_in_nanos" : 160329, "breakdown" : { "reduce" : 0, "build_aggregation" : 55712, "build_aggregation_count" : 1, "initialize" : 10559, "initialize_count" : 1, "reduce_count" : 0, "collect" : 94052, "collect_count" : 4 } } ] } ] } ] } }
From the profile structure we can see that the my_scoped_agg
is internally being run as a LongTermsAggregator
(because the field it is
aggregating, likes
, is a numeric field). At the same level, we see a GlobalAggregator
which comes from my_global_agg
. That
aggregation then has a child LongTermsAggregator
which comes from the second term’s aggregation on likes
.
The time_in_nanos
field shows the time executed by each aggregation, and is inclusive of all children. While the overall time is useful,
the breakdown
field will give detailed stats about how the time was spent.
Timing Breakdown
editThe breakdown
component lists detailed timing statistics about low-level Lucene execution:
"breakdown": { "reduce": 0, "reduce_count": 0, "build_aggregation": 49765, "build_aggregation_count": 300, "initialize": 52785, "initialize_count": 300, "reduce_count": 0, "collect": 3155490036, "collect_count": 1800 }
Timings are listed in wall-clock nanoseconds and are not normalized at all. All caveats about the overall
time
apply here. The intention of the breakdown is to give you a feel for A) what machinery in Elasticsearch is
actually eating time, and B) the magnitude of differences in times between the various components. Like the overall time,
the breakdown is inclusive of all children times.
The meaning of the stats are as follows:
All parameters:
edit
|
This times how long it takes to create and initialise the aggregation before starting to collect documents. |
|
This represents the cumulative time spent in the collect phase of the aggregation. This is where matching documents are passed to the aggregation and the state of the aggregator is updated based on the information contained in the documents. |
|
This represents the time spent creating the shard level results of the aggregation ready to pass back to the reducing node after the collection of documents is finished. |
|
This is not currently used and will always report |
|
Records the number of invocations of the particular method. For example, |