[CI][Java] Integration jobs with Spark fail with NoSuchMethodError:io.netty.buffer.PooledByteBufAllocator #36332

raulcd · 2023-06-27T16:04:40Z

Describe the bug, including details regarding any error messages, version, and platform.

It seems that #36211, which switched from PoolThreadCache to PoolArenasCache, has caused our nightly integration tests against both the previous and the current development versions of Spark to fail.

The error:

 02:07:30.759 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 24.0 (TID 30) (ab33723e6432 executor driver): java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.threadCache()Lio/netty/buffer/PoolArenasCache;

Spark hasn't yet updated to Netty 4.1.94.Final.
I am unsure how we should fix this. Does this mean we break backwards compatibility with previous Spark versions?
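For readers unfamiliar with why a changed return type surfaces as a NoSuchMethodError: the JVM links a call by the method's full descriptor, and the descriptor includes the return type. The sketch below only illustrates that point with JDK classes; `DescriptorSketch` and `noArgDescriptor` are hypothetical names, and `String`/`CharSequence` stand in for the two Netty cache types, which are not referenced here.

```java
import java.lang.invoke.MethodType;

final class DescriptorSketch {
    // Builds the JVM descriptor of a no-argument method with the given return type.
    // This is the same notation that appears at the end of the error message above.
    static String noArgDescriptor(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Same name and parameters, different return type: two distinct descriptors.
        System.out.println(noArgDescriptor(String.class));
        System.out.println(noArgDescriptor(CharSequence.class));
    }
}
```

Under that reading, code compiled against a `threadCache()` that returns `PoolArenasCache` asks the JVM for a descriptor that an older Netty on Spark's classpath would not provide, which would match the failure reported here.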

Component(s)

Continuous Integration, Java


raulcd · 2023-06-27T16:06:11Z

@BryanCutler @lidavidm @kiszk FYI
I've also opened a ticket in the Spark JIRA to let them know about the CVE.

lidavidm · 2023-06-27T16:18:13Z

Possibly we can do something via runtime reflection (though that would have to be carefully benchmarked)
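A rough sketch of what a reflection-based lookup could look like, under stated assumptions: the method is resolved by name and parameter list only, so the call no longer depends on the return type baked into the compiled descriptor. All class names below (`ReflectiveAccessSketch`, `OldAllocator`, `NewAllocator`, `OldCache`, `NewCache`) are hypothetical stand-ins, not Arrow or Netty classes, and this is not the approach that was eventually taken.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

final class ReflectiveAccessSketch {
    // Hypothetical stand-ins for two library versions whose accessor
    // differs only in its return type.
    static final class OldCache {}
    static final class NewCache {}
    static final class OldAllocator { OldCache threadCache() { return new OldCache(); } }
    static final class NewAllocator { NewCache threadCache() { return new NewCache(); } }

    // Reflection matches on name and parameter types, not on return type,
    // so this finds the accessor in either version.
    static MethodHandle resolveNoArg(Class<?> owner, String name) throws ReflectiveOperationException {
        Method method = owner.getDeclaredMethod(name);
        method.setAccessible(true);
        return MethodHandles.lookup().unreflect(method);
    }

    // Invokes the resolved accessor and returns the result as a plain Object.
    static Object callNoArg(Object target, String name) {
        try {
            return resolveNoArg(target.getClass(), name).invoke(target);
        } catch (Throwable t) {
            throw new IllegalStateException("Could not call " + name + "()", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNoArg(new OldAllocator(), "threadCache").getClass().getSimpleName());
        System.out.println(callNoArg(new NewAllocator(), "threadCache").getClass().getSimpleName());
    }
}
```

This sketch resolves the handle on every call to keep it short. A real version would presumably resolve it once and keep it in a `static final` field, and the per-call overhead of that is the part that would need benchmarking.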

kiszk · 2023-06-27T16:21:51Z

Thank you for sharing this. Here is a discussion on the Spark side.

BryanCutler · 2023-06-27T17:29:32Z

Could we have our Spark tests use Arrow with shaded dependencies?

lidavidm · 2023-06-28T17:39:43Z

I think shaded Netty would be the best solution (it would also hopefully unblock downstream Spark). But we don't build such an artifact currently.

Could we force the test to use Arrow with arrow-memory-unsafe added and arrow-memory-netty excluded?

raulcd · 2023-07-04T09:32:57Z

@lidavidm @BryanCutler is this or should this be a blocker for 13.0.0?

lidavidm · 2023-07-04T11:47:54Z

I think we should evaluate if shading Netty or using arrow-memory-unsafe in place of arrow-memory-netty works, or else evaluate if something reflection-based might work.

lidavidm · 2023-07-04T11:48:10Z

Er, so that is to say, yes, let's consider this a blocker.

lidavidm · 2023-07-05T13:20:40Z

It looks like to shade Netty or use arrow-memory-unsafe, we'd have to modify the Spark pom; I'm not sure if that quite qualifies as a solution for non-HEAD Spark.


danepitkin · 2023-07-06T18:30:53Z

I sent a note to the Spark ML: https://lists.apache.org/thread/ndmj3ht85j2g40n8clfh92ny6qqbvd09

So far, I think we are leaning towards Spark resolving HEAD on their side and breaking backwards compatibility with older Spark versions.


raulcd · 2023-07-12T14:12:29Z

Thanks @danepitkin !
If this is the solution we decide to go with, I suppose we can try to patch the POM to use the new Netty version (I am testing that on the related PR).
In that case this should not be a blocker for the release, as we will just "patch" older Spark versions (updating the version in the POM) to use the newer Netty version? @lidavidm

lidavidm · 2023-07-12T14:53:40Z

I think Spark will just have to be ignored.

raulcd · 2023-07-14T10:30:14Z

I never posted the JIRA ticket that was opened on SPARK. Adding it for reference: https://issues.apache.org/jira/projects/SPARK/issues/SPARK-44212

raulcd · 2023-07-28T15:07:12Z

@lidavidm this can be closed after this one #36928 has been merged, right?

lidavidm · 2023-07-28T15:08:14Z

I merged that PR - so hopefully this is fixed

lidavidm · 2023-07-28T16:31:36Z

I ran crossbow on that PR - looks like it does pass now

raulcd · 2023-07-28T16:56:33Z

Thanks @lidavidm , I am closing it then!

### What changes were proposed in this pull request?

This PR upgrades Apache Arrow from 13.0.0 to 14.0.0.

### Why are the changes needed?

The Apache Arrow 14.0.0 release brings a number of enhancements and bug fixes.

In terms of bug fixes, the release addresses several critical issues that were causing failures in integration jobs with Spark ([GH-36332](apache/arrow#36332)) and problems with importing empty data arrays ([GH-37056](apache/arrow#37056)). It also optimizes the process of appending variable length vectors ([GH-37829](apache/arrow#37829)) and includes C++ libraries for MacOS AARCH 64 in Java-Jars ([GH-38076](apache/arrow#38076)).

The new features and improvements focus on enhancing the handling and manipulation of data. This includes the introduction of DefaultVectorComparators for large types ([GH-25659](apache/arrow#25659)), support for extended expressions in ScannerBuilder ([GH-34252](apache/arrow#34252)), and the exposure of the VectorAppender class ([GH-37246](apache/arrow#37246)).

The release also brings enhancements to the development and testing process, with the CI environment now using JDK 21 ([GH-36994](apache/arrow#36994)). In addition, the release introduces vector validation consistent with C++, ensuring consistency across different languages ([GH-37702](apache/arrow#37702)).

Furthermore, the usability of VarChar writers and binary writers has been improved with the addition of extra input methods ([GH-37705](apache/arrow#37705)), and VarCharWriter now supports writing from `Text` and `String` ([GH-37706](apache/arrow#37706)). The release also adds typed getters for StructVector, improving the ease of accessing data ([GH-37863](apache/arrow#37863)).

The full release notes are as follows:

- https://arrow.apache.org/release/14.0.0.html

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43650 from LuciferYang/arrow-14.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

raulcd added the Type: bug label Jun 27, 2023

github-actions bot added Component: Continuous Integration Component: Java labels Jun 27, 2023

raulcd mentioned this issue Jun 28, 2023

GH-36199: [Python][CI][Spark] Update spark versions used on our nightly tests #36347

Merged

raulcd added the Priority: Blocker Marks a blocker for the release label Jun 29, 2023

danepitkin added a commit to danepitkin/arrow that referenced this issue Jul 5, 2023

GH-36332: [Java] Fix Spark integration failure due to Netty version upgrade

2176273

github-actions bot mentioned this issue Jul 5, 2023

[Abandoned] [Java] Spark integration failure due to Netty version #36493

Closed

github-actions bot assigned danepitkin Jul 5, 2023

danepitkin removed their assignment Jul 5, 2023

raulcd added this to the 13.0.0 milestone Jul 7, 2023

raulcd added a commit to raulcd/arrow that referenced this issue Jul 12, 2023

GH-36332: [CI][Java] Patch spark to use Netty 4.1.94.Final on our integration tests

44ec8f8

github-actions bot mentioned this issue Jul 12, 2023

GH-36332: [CI][Java] Patch spark to use Netty 4.1.94.Final on our integration tests #36640

Closed

github-actions bot assigned raulcd Jul 12, 2023

raulcd modified the milestones: 13.0.0, 14.0.0 Jul 17, 2023

raulcd removed the Priority: Blocker Marks a blocker for the release label Jul 17, 2023

raulcd closed this as completed Jul 28, 2023

LuciferYang mentioned this issue Nov 4, 2023

[SPARK-45781][BUILD] Upgrade Arrow to 14.0.0 apache/spark#43650

Closed
