IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
Significant Text Aggregation Usage
An aggregation that returns interesting or unusual occurrences of free-text terms in a set. It is like the significant terms aggregation but differs in that:
- It is specifically designed for use on type text fields
- It does not require field data or doc-values
- It re-analyzes text content on-the-fly, meaning it can also filter duplicate sections of noisy text that otherwise tend to skew statistics.
Re-analyzing large result sets requires a lot of time and memory. It is recommended that the significant_text aggregation be used as a child of either the sampler or diversified sampler aggregation, to limit the analysis to a small selection of top-matching documents, e.g. 200. This typically improves speed, memory use and quality of results.
See the Elasticsearch documentation on significant text aggregation for more detail.
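The sampler recommendation above can be sketched as a raw request body. This is a minimal illustration, not output generated by the client: the outer aggregation name "sample" is a made-up label, and the shard_size of 200 simply mirrors the "e.g. 200" figure mentioned above. Note that shard_size applies per shard, so the total number of sampled documents depends on how many shards the index has.

```json
{
  "aggs": {
    "sample": {
      "sampler": {
        "shard_size": 200
      },
      "aggs": {
        "significant_descriptions": {
          "significant_text": {
            "field": "description",
            "filter_duplicate_text": true
          }
        }
      }
    }
  }
}
```

With this shape, the significant_text results would be read from inside the "sample" bucket rather than from the top level of the aggregations response.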
Fluent DSL example
a => a
    .SignificantText("significant_descriptions", st => st
        .Field(p => p.Description)
        .FilterDuplicateText()
    )
Object Initializer syntax example
new SignificantTextAggregation("significant_descriptions")
{
    Field = Infer.Field<Project>(p => p.Description),
    FilterDuplicateText = true
}
Example json output.
{
  "significant_descriptions": {
    "significant_text": {
      "field": "description",
      "filter_duplicate_text": true
    }
  }
}
Handling Responses
response.ShouldBeValid();
var sigNames = response.Aggregations.SignificantText("significant_descriptions");
sigNames.Should().NotBeNull();
sigNames.DocCount.Should().BeGreaterThan(0);
foreach (var bucket in sigNames.Buckets)
{
    bucket.Key.Should().NotBeNullOrEmpty();
    bucket.BgCount.Should().BeGreaterThan(0);
    bucket.DocCount.Should().BeGreaterThan(0);
    bucket.Score.Should().BeGreaterThan(0);
}