Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dask-on-Ray] Add Dask config helper, set task-based shuffle by default. #21114

Conversation

clarkzinzow
Copy link
Contributor

@clarkzinzow clarkzinzow commented Dec 15, 2021

Dask default's to a disk-based shuffle even thought we're using a distributed scheduler, which appears to be resulting in dropped data since the filesystem isn't shared across nodes. Dask Distributed manually sets the shuffle algorithm in the global config to the task-based shuffle, which the Dask-on-Ray scheduler should probably do as well.

This PR adds a Dask config helper, enable_dask_on_ray, that sets Dask-on-Ray as the default scheduler along with changing the default shuffle to a task-based shuffle. The shuffle method can still be overridden by the user by manually specifying df.set_index(shuffle="disk").

Related issue number

Closes #20108, closes #20992, closed #14048

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've always wanted this override to happen by default!

@clarkzinzow
Copy link
Contributor Author

clarkzinzow commented Dec 15, 2021

@ericl Once I saw that Dask Distributed was doing it, I felt a lot better about it. 😄 Although they still have the option Client(set_as_default=False) to keep from setting the global config, but I think that we can wait to expose that until a user asks for it.

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job finding this. Is there a way to add a test?

@ericl
Copy link
Contributor

ericl commented Dec 15, 2021

//python/ray/util/dask:test_dask_optimization FAILED in 3 out of 3 in 9.9s

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Dec 15, 2021
@clarkzinzow clarkzinzow added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Dec 16, 2021
Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Request change to add test cases?

@clarkzinzow
Copy link
Contributor Author

@rkooo567 Sure, I guess I could remove the explicit setting of the scheduler in the scheduler tests now that it happens automatically at import time.

@rkooo567 rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Dec 16, 2021
@rkooo567
Copy link
Contributor

Does this mean we allow to disable this in some cases? If so, I think we should have a test for that (disabling case)

@clarkzinzow clarkzinzow changed the title [Dask-on-Ray] Manually set global Dask scheduler config, along with task-based shuffle. [Dask-on-Ray] Add Dask config helper, set task-based shuffle by default. Dec 17, 2021
@clarkzinzow
Copy link
Contributor Author

@rkooo567 @ericl changed it to a config-setting helper, enable_dask_on_ray, which we can encourage Dask-on-Ray users to use.

@clarkzinzow clarkzinzow removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Dec 17, 2021
@rkooo567
Copy link
Contributor

Lgtm!!

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also OK with this approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
4 participants