Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42252][CORE] Add spark.shuffle.localDisk.file.output.buffer and deprecate spark.shuffle.unsafe.file.output.buffer #39819

Closed
wants to merge 11 commits into from

Conversation

wayneguow
Copy link
Contributor

@wayneguow wayneguow commented Jan 31, 2023

What changes were proposed in this pull request?

Deprecate spark.shuffle.unsafe.file.output.buffer and add a new config spark.shuffle.localDisk.file.output.buffer instead.

Why are the changes needed?

The old config is desgined to be used in UnsafeShuffleWriter, but now it has been used in all local shuffle writers through LocalDiskShuffleMapOutputWriter, introduced by #25007.

Does this PR introduce any user-facing change?

Old still works, advised to use new.

How was this patch tested?

Passed existing tests.

@wayneguow wayneguow changed the title [SPARK-42252][Core] Deprecate spark.shuffle.unsafe.file.output.buffer and add a new config [SPARK-42252][CORE] Deprecate spark.shuffle.unsafe.file.output.buffer and add a new config Jan 31, 2023
@wayneguow
Copy link
Contributor Author

cc @mccheah and @HyukjinKwon

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind, let's target this proposal for Apache Spark 3.5.0.

@wayneguow
Copy link
Contributor Author

If you don't mind, let's target this proposal for Apache Spark 3.5.0.

Ok, it's good. I will make some changes.

@wayneguow
Copy link
Contributor Author

Gentle ping @dongjoon-hyun, the target version of this proposal has been updated. Could you go on reviewing it when you have time? Thanks.

@dongjoon-hyun
Copy link
Member

Thank you for your patience. Could you rebase this PR?

@@ -22,6 +22,10 @@ license: |
* Table of contents
{:toc}

## Upgrading from Core 3.4 to 3.5

- Since Spark 3.4, `spark.shuffle.unsafe.file.output.buffer` is deprecated though still works. Use `spark.shuffle.localDisk.file.output.buffer` instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be Since Spark 3.5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm here, updated it, and waiting for GA.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-42252][CORE] Deprecate spark.shuffle.unsafe.file.output.buffer and add a new config [SPARK-42252][CORE] Add spark.shuffle.localDisk.file.output.buffer and deprecate spark.shuffle.unsafe.file.output.buffer May 7, 2023
@@ -22,6 +22,10 @@ license: |
* Table of contents
{:toc}

## Upgrading from Core 3.4 to 3.5

- Since Spark 3.5, `spark.shuffle.unsafe.file.output.buffer` is deprecated though still works. Use `spark.shuffle.localDisk.file.output.buffer` instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove thought still works. deprecated means it still works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solved

@HyukjinKwon
Copy link
Member

@wayneguow which PR made this change:

The old config is desgined to be used in UnsafeShuffleWriter, but now it has been used in all local shuffle writers through LocalDiskShuffleMapOutputWriter.

? Would be easier for me to review if you can point it out.

@wayneguow
Copy link
Contributor Author

@wayneguow which PR made this change:

The old config is desgined to be used in UnsafeShuffleWriter, but now it has been used in all local shuffle writers through LocalDiskShuffleMapOutputWriter.

? Would be easier for me to review if you can point it out.

@HyukjinKwon The PR is #25007 .

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM from my side.

cc @HyukjinKwon , @cloud-fan , @Ngone51 , @mridulm

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Aug 27, 2023
@github-actions github-actions bot closed this Aug 28, 2023
@dongjoon-hyun dongjoon-hyun reopened this Aug 28, 2023
@dongjoon-hyun
Copy link
Member

Too bad. Sorry but it seems that we forgot this PR, @wayneguow .

@dongjoon-hyun
Copy link
Member

It seems too late because Apache Spark 3.5 is RC3 stage. If we want this still, we can do for Apache Spark 4.0.0.

@github-actions github-actions bot closed this Aug 29, 2023
@wayneguow
Copy link
Contributor Author

Gentle ping @dongjoon-hyun , a pr from some time ago.

@LuciferYang
Copy link
Contributor

friendly ping @yaooqinn Do you have time to help take a look?

@yaooqinn yaooqinn closed this in dd8b05f Jun 14, 2024
@yaooqinn
Copy link
Member

Merged to master, thank you @dongjoon-hyun @LuciferYang @wayneguow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants