Writing aggregations
editWriting aggregations
editNEST allows you to write your aggregations using
- a strict fluent DSL
- a verbatim object initializer syntax that maps verbatim to the Elasticsearch API
- a more terse object initializer aggregation DSL
Three different ways, yikes that’s a lot to take in! Let’s go over them one at a time and explain when you might want to use each.
This is the json output for each example
{ "aggs": { "name_of_child_agg": { "children": { "type": "commits" }, "aggs": { "average_per_child": { "avg": { "field": "confidenceFactor" } }, "max_per_child": { "max": { "field": "confidenceFactor" } }, "min_per_child": { "min": { "field": "confidenceFactor" } } } } } }
Fluent DSL
editThe fluent lambda syntax is the most terse way to write aggregations. It benefits from types that are carried over to sub aggregations
Fluent DSL example
edits => s .Aggregations(aggs => aggs .Children<CommitActivity>("name_of_child_agg", child => child .Aggregations(childAggs => childAggs .Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor)) .Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor)) .Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor)) ) ) )
Object Initializer syntax
editThe object initializer syntax (OIS) is a one-to-one mapping with how aggregations have to be represented in the Elasticsearch API. While it has the benefit of being a one-to-one mapping, being dictionary based in C# means it can grow verbose rather quickly.
Here are the same aggregations as expressed in the Fluent API above, with the dictionary-based object initializer syntax
Object Initializer syntax example
editnew SearchRequest<Project> { Aggregations = new AggregationDictionary { { "name_of_child_agg", new ChildrenAggregation("name_of_child_agg", typeof(CommitActivity)) { Aggregations = new AggregationDictionary { {"average_per_child", new AverageAggregation("average_per_child", "confidenceFactor")}, {"max_per_child", new MaxAggregation("max_per_child", "confidenceFactor")}, {"min_per_child", new MinAggregation("min_per_child", "confidenceFactor")}, } } } } }
As you can see, the key in the dictionary is repeated as the name passed to the aggregation constructor. This starts to get hard to read and a little error prone, wouldn’t you agree?
There is a better way however…
Terse Object Initializer syntax
editThe Object Initializer syntax can be shortened dramatically by using *Aggregation
types directly,
allowing you to forego the need to introduce intermediary dictionaries to represent the aggregation DSL.
In using these, it is also possible to combine multiple aggregations using the bitwise &&
operator.
Compare the following example with the previous vanilla Object Initializer syntax
Object Initializer syntax example
editnew SearchRequest<Project> { Aggregations = new ChildrenAggregation("name_of_child_agg", typeof(CommitActivity)) { Aggregations = new AverageAggregation("average_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) && new MaxAggregation("max_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) && new MinAggregation("min_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) } }
Now that’s much cleaner! Assigning an *Aggregation
type directly to the Aggregation
property
on a search request works because there are implicit conversions within NEST to handle this for you.
Mixed usage of object initializer and fluent
editSometimes its useful to mix and match fluent and object initializer, the fluent Aggregations method therefore
also accepts AggregationDictionary
directly.
Fluent DSL example
edits => s .Aggregations(new ChildrenAggregation("name_of_child_agg", typeof(CommitActivity)) { Aggregations = new AverageAggregation("average_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) && new MaxAggregation("max_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) && new MinAggregation("min_per_child", Field<CommitActivity>(p => p.ConfidenceFactor)) })
Binary operators off the same descriptor
editFor dynamic aggregation building using the fluent syntax,
it can be useful to abstract the construction to methods as much as possible.
You can use the binary operator &&
on the same aggregation descriptor to compose the graph.
Each side of the binary operation can return null dynamically.
s => s .Aggregations(aggs => aggs .Children<CommitActivity>("name_of_child_agg", child => child .Aggregations(Combine) ) )
Returning a different AggregationContainer in fluent syntax
editAll the fluent selector expects is an IAggregationContainer
to be returned. You could abstract this to a
method returning AggregationContainer
which is free to use the object initializer syntax
to compose that AggregationContainer
.
s => s .Aggregations(aggs => aggs .Children<CommitActivity>("name_of_child_agg", child => child .Aggregations(childAggs => Combine()) ) )
Aggregating over a collection of aggregations
editAn advanced scenario may involve an existing collection of aggregation functions that should be set as aggregations
on the request. Using LINQ’s .Aggregate()
method, each function can be applied to the aggregation descriptor
(childAggs
below) in turn, returning the descriptor after each function application.
var aggregations = new List<Func<AggregationContainerDescriptor<CommitActivity>, IAggregationContainer>> { a => a.Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor)), a => a.Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor)), a => a.Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor)) }; return s => s .Aggregations(aggs => aggs .Children<CommitActivity>("name_of_child_agg", child => child .Aggregations(childAggs => aggregations.Aggregate(childAggs, (acc, agg) => { agg(acc); return acc; }) ) ) );
a list of aggregation functions to apply |
|
Using LINQ’s |
Handling responses
editThe SearchResponse
exposes an AggregateDictionary
which is specialized dictionary over <string, IAggregate>
that also
exposes handy helper methods that automatically cast IAggregate
to the expected aggregate response.
Let’s see this in action:
a => a .Children<CommitActivity>("name_of_child_agg", child => child .Aggregations(childAggs => childAggs .Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor)) .Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor)) .Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor)) ) )
Now, using .Aggregations
, we can easily get the Children
aggregation response out and from that,
the Average
and Max
sub aggregations.