[train+tune] Make path joining OS-agnostic by using `Path.as_posix` over `os.path.join` #42037
Thanks for the contribution! I think we should generally use `Path.as_posix` everywhere we need to do path joins to be OS agnostic, and pyarrow (which we use under the hood) does not like the mix of posix and windows separators.
I'm wondering if we want to actually go ahead and change ALL of them with this PR? I did a global search on the repo and have found a bunch more instances of `os.path.join` in our Ray Train/Tune code.
Another note: we basically don't have comprehensive windows tests, so support for windows for at least Ray Train/Tune is very experimental. If you are not tied down to local development on a windows machine, running on a devbox or a linux subsystem may be a good thing to consider.
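To make the separator mix mentioned above concrete, here is a minimal sketch. It uses `ntpath` (the Windows flavor of `os.path`) and `PureWindowsPath` so the Windows behavior can be reproduced on any OS. The `storage_path` value and the file names are assumptions chosen for illustration, not values taken from the Ray code base.

```python
import ntpath  # Windows implementation of os.path, importable on any OS
from pathlib import PureWindowsPath

# Hypothetical storage prefix that already uses forward slashes.
storage_path = "bucket/experiment"

# On Windows, os.path.join appends with "\", so the result mixes separators.
mixed = ntpath.join(storage_path, "checkpoint_00000", ".metadata.json")

# Joining via a Path object and calling as_posix() yields "/" throughout.
posix = PureWindowsPath(
    storage_path, "checkpoint_00000", ".metadata.json"
).as_posix()

print(mixed)  # bucket/experiment\checkpoint_00000\.metadata.json
print(posix)  # bucket/experiment/checkpoint_00000/.metadata.json
```

On a real Windows machine, `os.path.join` and `Path` would behave like `ntpath.join` and `PureWindowsPath` here.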
Signed-off-by: n3011 <[email protected]>
Co-authored-by: Justin Yu <[email protected]> Signed-off-by: Ishant Mrinal <[email protected]> Signed-off-by: n3011 <[email protected]>
Signed-off-by: n3011 <[email protected]>
@justinvyu thanks, we are dependent on both linux and windows, so these changes will be useful.
@justinvyu I have updated the PR. It would be great if we can include these changes in the next release, thanks.
Hey @n30111 sorry for the delay, I'll do another review on this PR, so that we can get it in for Ray 2.10.
@justinvyu Friendly ping on this PR.
Signed-off-by: Justin Yu <[email protected]>
@justinvyu Hi Justin, that particular option does not work for forks hosted by organizations (see: https://github.com/orgs/community/discussions/5634 ), but we just added you to the fork/repository to enable you to push the changes you made.
Signed-off-by: Justin Yu <[email protected]>
@n3011 Is it possible to add a small test to confirm that this PR fixes the failure scenario you're running into? Thanks! https://github.com/ray-project/ray/blob/master/python/ray/train/tests/test_windows.py
@@ -129,7 +130,7 @@ def get_metadata(self) -> Dict[str, Any]:

         If no metadata is stored, an empty dict is returned.
         """
-        metadata_path = os.path.join(self.path, _METADATA_FILE_NAME)
+        metadata_path = Path(self.path, _METADATA_FILE_NAME).as_posix()
Should we create a common utility that wraps this `Path`/`as_posix` logic?
I was thinking about it but I prefer just seeing this spelled out. I think the longer term goal is to replace everything with `Path` objects and never pass strings around.
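For reference, the kind of wrapper being discussed could look roughly like the sketch below. The name `join_posix` and the `path_cls` parameter are hypothetical and exist only for this illustration. The PR did not add such a helper and spells out `Path(...).as_posix()` at each call site instead.

```python
from pathlib import Path, PureWindowsPath


def join_posix(*parts, path_cls=Path):
    """Hypothetical helper: join path parts and return a "/"-separated string.

    path_cls defaults to the current platform's Path class. Passing
    PureWindowsPath lets the Windows parsing rules be exercised on any OS.
    """
    return path_cls(*parts).as_posix()


# Same call shape as os.path.join, but the separator is always "/".
print(join_posix("experiment", "checkpoint_00000", ".metadata.json"))

# Backslash-separated input is normalized when parsed as a Windows path.
print(join_posix("experiment\\trial_0", "result.json", path_cls=PureWindowsPath))
```

The trade-off raised in the thread still applies: a helper like this keeps passing strings around, whereas the stated longer-term goal is to pass `Path` objects everywhere.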
@@ -41,8 +41,8 @@ class JsonLogger(Logger):

     def _init(self):
         self.update_config(self.config)
-        local_file = os.path.join(self.logdir, EXPR_RESULT_FILE)
-        self.local_out = open(local_file, "a")
+        local_file = Path(self.logdir, EXPR_RESULT_FILE)
Should this include `as_posix()`?
Same question for other instances below.
I just used `Path.open` instead of `open(str)` below.
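A small sketch of the `Path.open` pattern mentioned in this reply, under the assumption that the file is only opened locally and never handed to `pyarrow.fs` as a string, so no `as_posix()` conversion is needed. The temporary directory and the `result.json` file name are stand-ins for the logger's `logdir` and `EXPR_RESULT_FILE`.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as logdir:
    # Build the path as a Path object instead of a joined string.
    local_file = Path(logdir, "result.json")  # illustrative file name

    # Path.open takes the same mode arguments as the built-in open().
    with local_file.open("a") as local_out:
        local_out.write('{"training_iteration": 1}\n')

    contents = local_file.read_text()

print(contents)
```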
This test covers the scenario https://github.com/ray-project/ray/blob/master/python/ray/train/tests/test_trainer_restore.py#L189. Is it also run on Windows systems?
…path_fix Signed-off-by: Justin Yu <[email protected]>
Thanks for this contribution @n30111! 🚀 Let me know if you get a chance to try it out on nightly.
…ver `os.path.join` (ray-project#42037) `os.path.join` uses an OS-dependent separator (`\` on windows and `/` on posix systems), which causes some issues when `os.path.join` is used in conjunction with `Path`. The combination can result in a mix of `/` and `\` when running Ray Train/Tune on windows. This runs into issues when passing these paths into `pyarrow.fs`, leading to issues such as `FileNotFound`. --------- Signed-off-by: n3011 <[email protected]> Signed-off-by: Ishant Mrinal <[email protected]> Signed-off-by: Justin Yu <[email protected]> Co-authored-by: n3011 <[email protected]> Co-authored-by: Justin Yu <[email protected]>
Why are these changes needed?
`os.path.join` uses an OS-dependent separator (`\` on windows and `/` on posix systems), which causes some issues when `os.path.join` is used in conjunction with `Path`. The combination can result in a mix of `/` and `\` when running Ray Train/Tune on windows. This runs into issues when passing these paths into `pyarrow.fs`, leading to issues such as `FileNotFound`.
This PR fixes the following issues:
- While running training on Windows OS with S3 as storage system, some files are not placed in correct locations. For example the file `checkpoint_00000/.metadata.json` ends up as `checkpoint_00000\.metadata.json`.
- Also while restoring a tuner run, it fails, as `tuner.pkl` is not located correctly due to path issue.
Related issue number
Checks
- … (`git commit -s`) in this PR.
- … `scripts/format.sh` to lint the changes in this PR.
- … method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
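As a rough illustration of the misplaced `.metadata.json` file described under "Why are these changes needed?", the sketch below uses a plain dict as a stand-in for an object store whose keys are flat strings. This is an assumption made for illustration only: it is not `pyarrow.fs` or S3, and the variable names are hypothetical. `ntpath` and `PureWindowsPath` reproduce the Windows joins on any OS.

```python
import ntpath  # Windows implementation of os.path, importable on any OS
from pathlib import PureWindowsPath

# Stand-in for an object store: keys are plain strings, and "/" is only a
# naming convention, so "\" is just another character in the key.
store = {}

checkpoint_dir = "checkpoint_00000"

# Writer side: a Windows-style os.path.join produces a backslash key.
written_key = ntpath.join(checkpoint_dir, ".metadata.json")
store[written_key] = b"{}"

# Reader side: the "/"-separated key is what the lookup expects.
expected_key = PureWindowsPath(checkpoint_dir, ".metadata.json").as_posix()

found = expected_key in store
print(written_key, expected_key, found)
```

With both sides built through `Path(...).as_posix()`, the written and expected keys would be the same string.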