OOM on date_histogram with small interval #72619

Open
nik9000 opened this issue May 3, 2021 · 12 comments
Labels
:Analytics/Aggregations, >bug, Team:Analytics

Comments

@nik9000
Member

nik9000 commented May 3, 2021

I recently merged #72081, which protects against OOM in the reduce phase for date_histograms. A Discuss user reported a similar issue, but they seem to hit it on the data nodes while building results.

Elasticsearch version (bin/elasticsearch --version): Reported on 7.9 - @nik9000 thinks it should be possible to reproduce against master

Steps to reproduce:

We just got the stack trace in the linked Discuss issue. Looks like they have a wide range and a tight interval. They have min_doc_count set to 0, but I don't think that matters too much here. I'd try one-second bins with a hundred thousand docs, each in a different second. The trick, I think, is not to run out of memory when collecting the agg - we have protections there - but to run out of memory when building the bloaty result objects we send back to the coordinating node.
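A minimal, untested sketch of that repro (the index name, doc count, and interval are my guesses; on recent versions the search.max_buckets limit may trip before the OOM does):

curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/oom_test' -d '{
 "mappings": { "properties": { "date": { "type": "date" } } }
}'

# one doc per second, so each lands in its own one-second bucket
# (slow; _bulk would be faster, but this keeps the sketch simple)
for i in {0..100000}
do
  curl -s -o /dev/null -H 'Content-type: application/json' \
    -XPOST 'http://localhost:9200/oom_test/_doc' -d "{ \"date\": $(($i * 1000)) }"
done

curl -H 'Content-type: application/json' -XPOST 'http://localhost:9200/oom_test/_search' -d '{
 "size": 0,
 "aggregations": {
  "dateHistogram": {
   "date_histogram": { "field": "date", "fixed_interval": "1s", "min_doc_count": 0 }
  }
 }
}'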

@elasticmachine added the Team:Analytics label May 3, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@sag-tobias-frey
Contributor

Bash script to reproduce the issue:

# create the index and add a nested mapping
curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/test_index2' -d '{}'
curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/test_index2/_mapping' -d '{
 "properties": {
  "nested": {
   "type": "nested"
  }
 }
}'

# index 1001 docs, each with two nested date values
for i in {0..1000}
do
  dateValue=$(($i * 100))
  date_end=$(($i * 100000))
  curl -H 'Content-type: application/json' -XPOST 'http://localhost:9200/test_index2/_doc' -d "{
	\"value\": \"A\",
	\"textValue\": $dateValue,
	\"nested\": [{
		\"date\": 0
	}, {
		\"date\": $date_end
	}]
  }"
done

sleep 5 # give the new docs time to become searchable

curl -H 'Content-type: application/json' -XPOST 'http://localhost:9200/test_index2/_search' -d '{
 "size": 0,
 "query": {
  "bool": {
   "adjust_pure_negative": true,
   "boost": 1
  }
 },
 "aggregations": {
  "agg1": {
   "terms": {
    "field": "textValue",
    "size": 2147483647,
    "min_doc_count": 0
   },
   "aggregations": {
    "agg2": {
     "terms": {
      "field": "textValue",
      "size": 2147483647,
      "min_doc_count": 0
     },
     "aggregations": {
      "activities": {
       "nested": {
        "path": "nested"
       },
       "aggregations": {
        "dateHistogram": {
         "date_histogram": {
          "field": "nested.date",
          "calendar_interval": "1M",
          "offset": 0,
          "order": {
           "_key": "asc"
          },
          "keyed": false,
          "min_doc_count": 0
         }
        }
       }
      }
     }
    }
   }
  }
 }
}'

@nik9000
Member Author

nik9000 commented May 4, 2021

Thanks @sag-tobias-frey. It's interesting that you got there with nested! I hadn't realized that might be in the mix. Fun times.

@sag-tobias-frey
Contributor

I have managed to get there even without nested:

# create the index; the mapping still defines the nested field, but the query below does not use it
curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/test_index_3' -d '{}'
curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/test_index_3/_mapping' -d '{
 "properties": {
  "nested": {
   "type": "nested"
  }
 }
}'

# index 1001 docs with a top-level date field
for i in {0..1000}
do
  dateValue=$(($i * 100))
  date_end=$(($i * 100000))
  curl -H 'Content-type: application/json' -XPOST 'http://localhost:9200/test_index_3/_doc' -d "{
	\"value\": \"A\",
	\"textValue\": $dateValue,
	\"date\": $date_end,
	\"nested\": [{
		\"date\": 0
	}, {
		\"date\": $date_end
	}]
  }"
done

sleep 5 # give the new docs time to become searchable

curl -H 'Content-type: application/json' -XPOST 'http://localhost:9200/test_index_3/_search' -d '{
  "size": 0,
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "aggregations": {
    "agg1": {
      "terms": {
        "field": "textValue",
        "size": 1001,
        "min_doc_count": 0
      },
      "aggregations": {
        "agg2": {
          "terms": {
            "field": "textValue",
            "size": 1001,
            "min_doc_count": 0
          },
          "aggregations": {
            "dateHistogram": {
              "date_histogram": {
                "field": "date",
                "calendar_interval": "1M",
                "offset": 0,
                "order": {
                  "_key": "asc"
                },
                "keyed": false,
                "min_doc_count": 0
              }
            }
          }
        }
      }
    }
  }
}'

@salvatore-campagna
Contributor

salvatore-campagna commented May 23, 2022

I executed some tests (with a 0.5GB heap) running the query against the current master branch, and I see that we still have an OOM, but not the one described originally. If my understanding is correct, the original OOM happened on the coordinator and was fixed by #72081. That patch makes sure we hit the circuit breaker on the coordinator before the OOM takes place.

Right now, though, I see something different: my understanding is that the OOM is happening on the data node. To be more precise, the counter introduced by #72018 is never hit, because the issue happens before the reduce operation runs.

What I see is that the method BucketsAggregator#buildAggregationsForVariableBuckets is called with an array long[] owningBucketOrds whose size is 1_002_001 (so slightly more than 1M entries), which results in creating many objects of type InternalDateHistogram (the result of running a date histogram aggregation).

According to the heap dump, these objects (InternalDateHistogram, LongTerms.Bucket, InternalNested, InternalDateHistogram.EmptyBucketInfo) account for more than 40% of the heap.
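For anyone reproducing this, a comparable heap dump can be captured with standard JDK tooling; the pgrep pattern here assumes the default Elasticsearch main class appears on the node's command line:

# dump live objects from the (single, local) Elasticsearch process
jmap -dump:live,format=b,file=es-heap.hprof $(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)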

Increasing the heap to 2GB, I get the following response (no OOM):

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "too_many_buckets_exception",
      "reason": "Trying to create too many buckets. Must be less than or equal to: [65536] but this number of buckets was exceeded. This limit can be set by changing the [search.max_buckets] cluster level setting.",
      "max_buckets": 65536
    }
  },
  "status": 503
}
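As an aside, the limit in that response is the search.max_buckets cluster setting; it can be raised (the value here is just an example), though doing so only reintroduces the memory pressure this issue is about:

curl -H 'Content-type: application/json' -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "persistent": { "search.max_buckets": 1100000 }
}'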

NOTE: the original query is actually calculating a cross product on the textValue field, as a result of having two nested (numeric) terms aggregations and a nested date histogram. There are 1000 distinct numeric terms, which, considering the cross product, results in more than 1M buckets. Each of these buckets holds the result of a date histogram. So, in the end, the query returns over 1M (empty) date histograms.

@sag-tobias-frey
Contributor

> NOTE: the original query is actually calculating a cross product on the textValue field. There are 1000 distinct numeric terms, which, considering the cross product, results in more than 1M buckets. Each of these buckets holds the result of a date histogram. So, in the end, the query returns over 1M (empty) date histograms.

However, shouldn't the number of resulting buckets from the cross product of textValue be 1000, because it is the same field, which always has the same value? In the end the date histogram pushes it over the max buckets anyway, but the first aggregations should be fine.

Have you tried sending multiple of these requests in parallel against the 2GB cluster? We noticed that we have to be careful with these kinds of aggregations/distributions when we have parallel requests, because then the OOM error might still occur even with more heap, since the circuit breaker does not detect it early enough.
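For example, something like this (query.json standing in for the search body from the script above):

# fire 8 copies of the search concurrently
for i in {1..8}; do
  curl -s -o /dev/null -H 'Content-type: application/json' \
    -XPOST 'http://localhost:9200/test_index_3/_search' -d @query.json &
done
wait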

@salvatore-campagna
Contributor

salvatore-campagna commented May 23, 2022

> NOTE: the original query is actually calculating a cross product on the textValue field. There are 1000 distinct numeric terms, which, considering the cross product, results in more than 1M buckets. Each of these buckets holds the result of a date histogram. So, in the end, the query returns over 1M (empty) date histograms.

> However, shouldn't the number of resulting buckets from the cross product of textValue be 1000, because it is the same field, which always has the same value? In the end the date histogram pushes it over the max buckets anyway, but the first aggregations should be fine.
>
> Have you tried sending multiple of these requests in parallel against the 2GB cluster? We noticed that we have to be careful with these kinds of aggregations/distributions when we have parallel requests, because then the OOM error might still occur even with more heap, since the circuit breaker does not detect it early enough.

Regarding the cross product, my understanding is different. If we have three documents with textValue equal to 1, 2, 3, the result will be something like:

(1, 1, date_histo)
(1, 2, date_histo)
(1, 3, date_histo)
...
(3, 2, date_histo)
(3, 3, date_histo)

Extending it to 1000 distinct values results in 1M buckets.
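For what it's worth, the numbers line up exactly: the indexing loop above produces 1001 distinct values (0 through 1000), and 1001 × 1001 = 1_002_001, which matches the owningBucketOrds size reported earlier.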

Anyway, yes, the problem is that the circuit breaker is not firing. I think that is because the creation of objects like InternalDateHistogram is not accounted for... or is not checked early enough.

@sag-tobias-frey
Contributor

sag-tobias-frey commented May 24, 2022

Removed

@salvatore-campagna
Contributor

I had a discussion with the team about this issue and the agreement is that it needs to be addressed by the following two issues:

  • Make sure all significant memory usage in aggs are tracked in BigArrays #59892: most of our aggregations use data structures like lists and arrays which are not tracked through the BigArrays abstraction. As a result, we use memory without accounting for it, which causes OOMs before a circuit breaker fires.
  • Dense representation for aggs #77449: this is more a consequence, since the data structures that are not accounted for in memory consumption are also serialised to the wire format. As a result, we need to come up with a compact representation to avoid large network traffic.

@salvatore-campagna
Contributor

salvatore-campagna commented Jun 6, 2022

The objects taking the most space are the following:

  • byte[]
  • InternalDateHistogram
  • BucketsAggregator$1
  • LongTerms$Bucket
  • InternalAggregations
  • InternalNested
  • InternalDateHistogram$EmptyBucketInfo

Attaching a script which triggers the issue (test.txt).
NOTE: you need to run Elasticsearch with a small heap to see the OOM. I used 512M.

(Attachments: heap-dump screenshot "Screenshot 2022-06-06 at 11 01 34"; repro script test.txt.)
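For reference, one way to start a node with that heap size (assuming a tarball install; package installs set this in jvm.options instead):

ES_JAVA_OPTS="-Xms512m -Xmx512m" ./bin/elasticsearch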

@nik9000
Member Author

nik9000 commented Jun 6, 2022

Just an update for posterity/those following along at home - this is mostly #77449: our response objects are super wasteful and sometimes allocate so quickly that the real-memory breaker doesn't catch them. A dense representation would save us here, and help lots of other things.

In the short run, I expect we could save some heap by reworking how EmptyBucketInfo works. But I think cutting aggs over to a dense representation is probably a good thing anyway.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)
