-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save memory when numeric terms agg is not top #55873
Conversation
Right now all implementations of the `terms` agg allocate a new `Aggregator` per bucket. This uses a bunch of memory. Exactly how much isn't clear but each `Aggregator` ends up making its own objects to read doc values which have non-trivial buffers. And it forces all of it sub-aggregations to do the same. We allocate a new `Aggregator` per bucket for two reasons: 1. We didn't have an appropriate data structure to track the sub-ordinals of each parent bucket. 2. You can only make a single call to `runDeferredCollections(long...)` per `Aggregator` which was the only way to delay collection of sub-aggregations. This change adds a way to return "deferred" aggregations from any bucket aggregation and undefers them as part of regular collections. This mechanism allows you to defer without `runDeferredCollections(long...)`. This change also adds a fairly simplistic data structure to track the sub-ordinals for `long`-keyed buckets. It uses both of those to power numeric `terms` aggregations and removes the per-bucket allocation of their `Aggregator`. This fairly substantially reduces memory consumption of numeric `terms` aggregations that are not the "top level", especially when those aggregations contain many sub-aggregations. I picked numeric `terms` aggregations because those have the simplest implementation. At least, I could kind of fit it in my head. And I haven't fully understood the "bytes"-based terms aggregations, but I imagine I'll be able to make similar optimizations to them in follow up changes.
@elasticmachine, test this pleas |
Adds a rally track specifically for testing the performance of the terms agg as I'll be doing some work on it. In particular this focuses on numeric terms because the first phase of my work only touches them. Relates to elastic/elasticsearch#55873
This looks like it might actually speed up nested numeric terms.
I kind of figured that it'd be a little faster. But these numbers are more than I'd though.... They look real though. |
@@ -158,19 +158,27 @@ public final InternalAggregation reducePipelines( | |||
|
|||
@Override | |||
public InternalAggregation copyWithRewritenBuckets(Function<InternalAggregations, InternalAggregations> rewriter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can revert this.
I've rebuilt this, replacing delaying building results with reworking agg results to be built for the entire aggregator at once. I've push some code that looks to mostly work and will be cleaning it up soon! |
@@ -174,6 +373,9 @@ public Aggregator resolveSortPath(AggregationPath.PathElement next, Iterator<Agg | |||
|
|||
@Override | |||
public BucketComparator bucketComparator(String key, SortOrder order) { | |||
if (false == this instanceof SingleBucketAggregator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was missing from some work that I did previously and removing the wrapping of LongTermsAggregator
revealed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this we get strange test failures around sorting.
Adds a rally track specifically for testing the performance of the bucket aggs when they are "sub" aggs. Relates to elastic/elasticsearch#55873
Adds a rally track specifically for testing the performance of the bucket aggs when they are "sub" aggs. Relates to elastic/elasticsearch#55873
Adds a rally track specifically for testing the performance of the bucket aggs when they are "sub" aggs. Relates to elastic/elasticsearch#55873
Reworks the `parent` and `child` aggregation are not at the top level using the optimization from elastic#55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by adding recording which global ordinals show up in the parent and then checking if they match the child.
Reworks the `parent` and `child` aggregation are not at the top level using the optimization from #55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by adding recording which global ordinals show up in the parent and then checking if they match the child.
Reworks the `parent` and `child` aggregation are not at the top level using the optimization from elastic#55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by adding recording which global ordinals show up in the parent and then checking if they match the child.
Reworks the `parent` and `child` aggregation are not at the top level using the optimization from #55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by adding recording which global ordinals show up in the parent and then checking if they match the child.
This uses the optimization that we started making in elastic#55873 for `rare_terms` to save a bit of memory when that aggregation is not on the top level.
This uses the optimization that we started making in #55873 for `rare_terms` to save a bit of memory when that aggregation is not on the top level.
This uses the optimization that we started making in elastic#55873 for `rare_terms` to save a bit of memory when that aggregation is not on the top level.
This merges the aggregator for `significant_text` into `significant_terms`, applying the optimization built in elastic#55873 to save memory when the aggregation is not on top. The `significant_text` aggregation is pretty memory intensive all on its own and this doesn't particularly help with that, but it'll help with the memory usage of any sub-aggregations.
This merges the aggregator for `significant_text` into `significant_terms`, applying the optimization built in #55873 to save memory when the aggregation is not on top. The `significant_text` aggregation is pretty memory intensive all on its own and this doesn't particularly help with that, but it'll help with the memory usage of any sub-aggregations.
This merges the aggregator for `significant_text` into `significant_terms`, applying the optimization built in elastic#55873 to save memory when the aggregation is not on top. The `significant_text` aggregation is pretty memory intensive all on its own and this doesn't particularly help with that, but it'll help with the memory usage of any sub-aggregations.
This merges the aggregator for `significant_text` into `significant_terms`, applying the optimization built in #55873 to save memory when the aggregation is not on top. The `significant_text` aggregation is pretty memory intensive all on its own and this doesn't particularly help with that, but it'll help with the memory usage of any sub-aggregations.
Right now all implementations of the
terms
agg allocate a newAggregator
per bucket. This uses a bunch of memory. Exactly how muchisn't clear but each
Aggregator
ends up making its own objects to readdoc values which have non-trivial buffers. And it forces all of it
sub-aggregations to do the same. We allocate a new
Aggregator
perbucket for two reasons:
sub-ordinals of each parent bucket.
runDeferredCollections(long...)
per
Aggregator
which was the only way to delay collection ofsub-aggregations.
This change adds a way to return "deferred" aggregations from any
bucket aggregation and undefers them as part of regular collections.
This mechanism allows you to defer without
runDeferredCollections(long...)
.This change also adds a fairly simplistic data structure to track the
sub-ordinals for
long
-keyed buckets.It uses both of those to power numeric
terms
aggregations and removesthe per-bucket allocation of their
Aggregator
. This fairlysubstantially reduces memory consumption of numeric
terms
aggregationsthat are not the "top level", especially when those aggregations contain
many sub-aggregations.
I picked numeric
terms
aggregations because those have the simplestimplementation. At least, I could kind of fit it in my head. And I
haven't fully understood the "bytes"-based terms aggregations, but I
imagine I'll be able to make similar optimizations to them in follow up
changes.