[data] Optimize dataset metadata read/write in Ray client #21939
Hmm, does that mean we can't really do anything heavy in the driver if Ray client is a first-class citizen?
python/ray/data/dataset.py
Outdated
blocks, metadata = zip(*self._blocks.get_blocks_with_metadata())
write_results = datasource.do_write(blocks, metadata, **write_args)

# Prepare write in a remote task so that in Ray client mode, we aren't
What's the cost in non-Ray-client mode?
Shouldn't matter much (a few ms)
Yeah, I think that's a good bias for Ray native libraries (for example, Tune already avoids running heavy things in the driver).
I didn't have that awareness before, but yeah, I (really everyone) will keep that in mind for future code. Not sure if there is something that would help us remember this (probably one item on the review checklist should be checking that the code works nicely with Ray client).
Hmm, this seems to be causing test_basic_actors to hang somehow.
Blocked on #21970
Why are these changes needed?
In Ray client mode, the client may not have the same low-latency access to the datasource as the cluster nodes. Run datasource prepare_read/do_write in a remote task to make sure it doesn't run on the driver node.
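The idea described above can be sketched without a cluster. This is only an illustration, not the code in this PR: `ToyDatasource`, `_prepare_read`, and `run_off_driver` are hypothetical names, and a standard-library thread pool stands in for a Ray remote task so the snippet runs anywhere. With Ray, the submit step would look roughly like `ray.get(ray.remote(_prepare_read).remote(source, parallelism))`.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDatasource:
    """Hypothetical stand-in for a datasource; not Ray's Datasource class."""

    def prepare_read(self, parallelism):
        # Imagine this lists files or reads metadata from storage that is
        # close to the cluster but far from the client machine.
        return [{"task": i, "num_rows": 10} for i in range(parallelism)]


def _prepare_read(source, parallelism):
    # Plain module-level function, so it could be wrapped as a remote task.
    return source.prepare_read(parallelism)


def run_off_driver(executor, fn, *args):
    # Assumption: the executor represents "somewhere on the cluster".
    # Only the (small) result travels back to the caller.
    return executor.submit(fn, *args).result()


source = ToyDatasource()
with ThreadPoolExecutor(max_workers=1) as executor:
    read_tasks = run_off_driver(executor, _prepare_read, source, 4)

print(len(read_tasks))
```

The driver only submits the call and waits for the result, so the datasource access itself happens wherever the task is scheduled rather than on the client.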
Related issue number
Checks
scripts/format.sh to lint the changes in this PR.