Prevent `date_histogram` from OOMing #72081

nik9000 · 2021-04-22T12:03:20Z

This prevents the date_histogram from running out of memory allocating
empty buckets when you set the interval to something tiny like seconds
and aggregate over a very wide date range. Without this change we'd
allocate memory very quickly and throw an out of memory error, taking
down the node. With it we instead throw the standard "too many buckets"
error.

Relates to #71758
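
For readers who want to see the idea in miniature, here is a rough sketch of the approach described above: while walking the gap between two bucket keys, count the empty buckets and report them to a limit checker in batches, so a pathological request fails with a "too many buckets" style error long before memory runs out. This is not the code from the PR. `EmptyBucketSketch`, `BucketLimit`, `TooManyBucketsException`, `fillEmptyBuckets`, and the 65,536 limit are all hypothetical names and values chosen for illustration; only the batch size of 10000 comes from the diff.

```java
import java.util.function.IntConsumer;

public class EmptyBucketSketch {
    /** Hypothetical stand-in for the "too many buckets" error. */
    static class TooManyBucketsException extends RuntimeException {
        TooManyBucketsException(long count, long max) {
            super("too many buckets: " + count + " > " + max);
        }
    }

    /** Hypothetical limit checker: sums reported bucket counts and throws once past the limit. */
    static class BucketLimit implements IntConsumer {
        private final long max;
        private long count;

        BucketLimit(long max) {
            this.max = max;
        }

        @Override
        public void accept(int added) {
            count += added;
            if (count > max) {
                throw new TooManyBucketsException(count, max);
            }
        }
    }

    /** Batch size taken from the diff; everything else here is an assumption. */
    static final int REPORT_EMPTY_EVERY = 10_000;

    /**
     * Walks the keys in [fromKey, toKey) in steps of interval, counting one
     * empty bucket per step and reporting to the limit checker in batches.
     * Returns the number of empty buckets counted.
     */
    static long fillEmptyBuckets(long fromKey, long toKey, long interval, IntConsumer limit) {
        long total = 0;
        int pending = 0;
        for (long key = fromKey; key < toKey; key += interval) {
            // A real aggregation would allocate an empty bucket for this key.
            total++;
            if (++pending == REPORT_EMPTY_EVERY) {
                limit.accept(pending);
                pending = 0;
            }
        }
        limit.accept(pending); // report whatever is left over
        return total;
    }

    public static void main(String[] args) {
        long oneYearMillis = 365L * 24 * 60 * 60 * 1000;
        try {
            // One-second buckets over 50 years: roughly 1.6 billion empty buckets.
            fillEmptyBuckets(0, 50 * oneYearMillis, 1000, new BucketLimit(65_536));
        } catch (TooManyBucketsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Reporting in batches rather than once at the end is the point: the check trips after a few tens of thousands of buckets instead of after the whole range has been materialized.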

elasticmachine · 2021-04-22T12:03:24Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

nik9000 · 2021-04-22T12:04:12Z

Like #71758 this one has been here for a while, seems fairly rare, but has a fairly bad effect.


nik9000 · 2021-04-22T13:19:23Z

Ouch, checkstyle.

not-napoleon

LGTM

imotov

LGTM.

imotov · 2021-04-23T20:16:18Z

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.aggregation/10_histogram.yml

+"Tiny tiny tiny date_range":
+  - skip:
+      version: " - 7.99.99"
+      reason:  fixed in 8.0 and being backported to 7.13.0


Probably too late for 7.13.0 now

Yeah, probably.

imotov · 2021-04-23T20:17:47Z

.../main/java/org/elasticsearch/search/aggregations/bucket/histogram/InternalDateHistogram.java

+         * this quickly in pathological cases and plenty large to keep the
+         * overhead minimal.
+         */
+        int reportEmptyEvery = 10000;


Should this be a constant?

Probably. I stuck it here so the comment above it could describe why it has the value it does. I'm not sure of the right way to do that if it's a constant without making it harder to read.

Can you at least make it final? My first reaction was "Why and where is he changing it?".
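
One way to get both things asked for in this thread, a value that visibly never changes and an explanation that stays next to it, might look like the sketch below. The class name, the Javadoc wording, and the `reportsNeeded` helper are hypothetical and are not taken from the merged change.

```java
public class ReportEverySketch {
    /**
     * How many empty buckets to accumulate before reporting them to the
     * bucket limit check. Small enough to fail quickly in pathological
     * cases, large enough to keep the overhead of reporting minimal.
     * (Hypothetical wording, paraphrasing the comment in the diff.)
     */
    private static final int REPORT_EMPTY_EVERY = 10_000;

    /** Number of reports needed for the given count of empty buckets (ceiling division). */
    static long reportsNeeded(long emptyBuckets) {
        return (emptyBuckets + REPORT_EMPTY_EVERY - 1) / REPORT_EMPTY_EVERY;
    }

    public static void main(String[] args) {
        System.out.println(reportsNeeded(25_000));
    }
}
```

A `final` local with the same comment above it would also answer the "where is it being changed?" question; the class-level constant just makes the intent visible from further away.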


nik9000 added >bug :Analytics/Aggregations v8.0.0 v7.14.0 labels Apr 22, 2021

nik9000 requested review from imotov and not-napoleon April 22, 2021 12:03

elasticmachine added the Team:Analytics label Apr 22, 2021

not-napoleon approved these changes Apr 22, 2021


nik9000 added 2 commits April 22, 2021 11:19

Oh checkstyle

72991e1

Merge branch 'master' into date_histogram_no_crash

d35d753

imotov approved these changes Apr 23, 2021


nik9000 added 2 commits April 27, 2021 12:51

Merge branch 'master' into date_histogram_no_crash

d7c7635

Constants are easier to read because you expect them not to change

63b610d

nik9000 merged commit 5f281ce into elastic:master Apr 27, 2021

nik9000 added the backport pending label Apr 27, 2021

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Apr 28, 2021

Update skip after backport

71e7e55

Now that elastic#72081 has landed in the 7.x branch we can run its test in the backwards compatibility test suite.

nik9000 mentioned this pull request Apr 28, 2021

Update skip after backport #72381

Merged

nik9000 removed the backport pending label Apr 28, 2021

nik9000 added a commit that referenced this pull request Apr 28, 2021

Update skip after backport (#72381)

b31dba5

Now that #72081 has landed in the 7.x branch we can run its test in the backwards compatibility test suite.

nik9000 mentioned this pull request May 3, 2021

OOM on date_histogram with small interval #72619

Open

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
