[BUG] host memory Leak in MultiFileCoalescingPartitionReaderBase in UTC time zone #9974
Comments
Good news: this exact stack trace associated with MultiFileCoalescingPartitionReaderBase does not reproduce:
$ SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
TEST_PARALLEL=0 COVERAGE_SUBMIT_FLAGS="-Dai.rapids.refcount.debug=true" \
TZ= ./integration_tests/run_pyspark_from_build.sh -s |& tee parquet.log
$ grep -B 20 MultiFileCoalescingPartitionReaderBase parquet.log | grep -c Leaked
0
Bad news: other memory leaks are still reported:
$ grep -c Leaked parquet.log
76
Example:
|
There is a single test that reliably reproduces the new memory leak:
$ SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
TEST_PARALLEL=0 \
COVERAGE_SUBMIT_FLAGS="-Dai.rapids.refcount.debug=true" \
TZ= ./integration_tests/run_pyspark_from_build.sh -k test_json_tuple_select_non_generator_col -s |& tee build.log
This test alone is already responsible for 20 Leaked buffers. It reproduces on my laptop and desktop GPU, so I am pretty sure it also shows up in the CI log and would have been caught by #6947 had that already been implemented. |
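The leak counts quoted in this thread come from grepping the test log for the refcount-debug output. A minimal sketch of a helper that wraps both greps, assuming leak reports contain the word Leaked as in the commands above; the function name count_leaks is hypothetical and not part of the repository:

```shell
# Hypothetical helper, not part of spark-rapids.
# Usage: count_leaks LOGFILE [PATTERN]
#   Without PATTERN: count every line containing "Leaked".
#   With PATTERN:    count "Leaked" lines among the 20 lines preceding
#                    (and including) each line that matches PATTERN.
count_leaks() {
  log="$1"
  pattern="${2:-}"
  if [ -z "$pattern" ]; then
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    grep -c Leaked "$log" || true
  else
    grep -B 20 "$pattern" "$log" | grep -c Leaked || true
  fi
}
```

For example, `count_leaks build.log` would give the total number of leak reports, and `count_leaks parquet.log MultiFileCoalescingPartitionReaderBase` the number that appear close to that class in the log.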
bisect points to #10131 |
The memory leak from #9974 (comment) was fixed by #10360. @revans2 @sameerz, I assume we don't want to release 24.02.0 like this and the fix should be cherry-picked to branch-24.02, right? |
The original memory leaks reported in this issue's description cannot be reproduced. Closing. |
Describe the bug
Host memory leak in MultiFileCoalescingPartitionReaderBase
Steps/Code to reproduce bug
Errors:
Environment details (please complete the following information)
On branch-24.02
Spark311
Additional context
I suspect branch-23.12 also has this issue, but that is only a guess; I have not tested it yet.
I also found this leak issue: #9971
The following error can also be found in the log of a recently passed CI run (CI sequence number #8653).
Click Blue Ocean -> Premerge CI 2 -> Download log