[SPARK-44115][BUILD] Upgrade Apache ORC to 2.0.0 #45443

Closed

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 8, 2024

What changes were proposed in this pull request?

This PR aims to upgrade Apache ORC to 2.0.0 for Apache Spark 4.0.0.

The Apache ORC community has a 3-year support policy, which is longer than Apache Spark's. The release lines align as follows.

  • Apache ORC 2.0.x <-> Apache Spark 4.0.x
  • Apache ORC 1.9.x <-> Apache Spark 3.5.x
  • Apache ORC 1.8.x <-> Apache Spark 3.4.x
  • Apache ORC 1.7.x (Supported) <-> Apache Spark 3.3.x (End-Of-Support)

Why are the changes needed?

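Release Note
  • https://github.com/apache/orc/releases/tag/v2.0.0

Milestone
  • https://github.com/apache/orc/milestone/20?closed=1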

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun
Member Author

This was tested during the Apache ORC 2.0.0 RC0 vote.

This PR also passed here again.
[Screenshot 2024-03-08 at 12 10 34]

@dongjoon-hyun
Member Author

Could you review this PR when you have some time, please, @viirya?

Comment on lines +2596 to +2602
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-format</artifactId>
  <version>1.0.0</version>
  <classifier>${orc.classifier}</classifier>
  <scope>${orc.deps.scope}</scope>
</dependency>
Member

Hmm, why do we need to exclude orc-format from orc-core and add it as a separate dependency?

Member Author

Thank you for the review. It's because Apache Spark needs to control orc.classifier here. Without this, the default dependencies are pulled in.
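
The pattern described in this reply can be sketched roughly as follows. This is an illustrative fragment, not the exact diff from this PR: the `${orc.version}` property and the shape of the `orc-core` entry are assumptions, and only the `orc-format` coordinates, version `1.0.0`, `${orc.classifier}`, and `${orc.deps.scope}` come from the snippet under review.

```xml
<!-- Sketch only: exclude the transitive orc-format that orc-core would
     otherwise bring in with its default (unclassified) coordinates. -->
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <!-- ${orc.version} is an assumed property name, not shown in the review snippet -->
  <version>${orc.version}</version>
  <classifier>${orc.classifier}</classifier>
  <scope>${orc.deps.scope}</scope>
  <exclusions>
    <exclusion>
      <groupId>org.apache.orc</groupId>
      <artifactId>orc-format</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- Declare orc-format explicitly so the build controls its classifier and scope. -->
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-format</artifactId>
  <version>1.0.0</version>
  <classifier>${orc.classifier}</classifier>
  <scope>${orc.deps.scope}</scope>
</dependency>
```

Maven exclusions match on groupId and artifactId only, so a transitive dependency cannot be given a different classifier in place. Excluding it and redeclaring it is the usual way to apply one.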

@dongjoon-hyun
Member Author

Thank you so much, @viirya !

Merged to master for Apache Spark 4.0.0.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-44115 branch March 8, 2024 21:36
@yaooqinn
Member

Late +1 & LGTM. Thank you @dongjoon-hyun

sweisdb pushed a commit to sweisdb/spark that referenced this pull request Apr 1, 2024
### What changes were proposed in this pull request?

This PR aims to upgrade Apache ORC to 2.0.0 for Apache Spark 4.0.0.

The Apache ORC community has a 3-year support policy, which is longer than Apache Spark's. The release lines align as follows.
- Apache ORC 2.0.x <-> Apache Spark 4.0.x
- Apache ORC 1.9.x <-> Apache Spark 3.5.x
- Apache ORC 1.8.x <-> Apache Spark 3.4.x
- Apache ORC 1.7.x (Supported) <-> Apache Spark 3.3.x (End-Of-Support)

### Why are the changes needed?

**Release Note**
- https://github.com/apache/orc/releases/tag/v2.0.0

**Milestone**
- https://github.com/apache/orc/milestone/20?closed=1
  - apache/orc#1728
  - apache/orc#1801
  - apache/orc#1498
  - apache/orc#1627
  - apache/orc#1497
  - apache/orc#1509
  - apache/orc#1554
  - apache/orc#1708
  - apache/orc#1733
  - apache/orc#1760
  - apache/orc#1743

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#45443 from dongjoon-hyun/SPARK-44115.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>