[RuntimeEnv, Windows] Fix working_dir, pip & conda for windows #28589
Signed-off-by: Jeroen Bédorf <[email protected]>
@mattip Given that you previously worked on Windows related PRs could you help find some reviewers? Thanks!
@architkulkarni Can you help move this forward? Thanks!
Thanks for the contribution!
Regarding the conda tests:
As such not all tests are enabled for Windows as they would keep failing, and leave behind sizeable temporary files.
Certainly if they're failing we shouldn't enable them in the PR, but for those tests where the only issue is leaving behind temporary files, can we try to enable them in this PR? If the temporary files somehow cause a problem, we'll see it in the CI run for this PR.
else:
    context.command_prefix += [
        _PathHelper.get_virtualenv_activate_command(target_dir)
    ]
Should we make get_virtualenv_activate_command always return List[str] to avoid this if-else?
Yeah, I was thinking more about it. A cleaner solution would be to always add lists to the command_prefix item, then combine the items into a string for Linux/Mac and keep them as list items for Windows. So we would have to update it for each of the three plugins.
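The idea above could look roughly like the sketch below. The helper name `build_worker_command`, its `platform` parameter, and the plain-space join are assumptions made for illustration; they are not taken from the Ray code base.

```python
import sys
from typing import List, Union


def build_worker_command(
    command_prefix: List[str],
    executable: str,
    passthrough_args: List[str],
    platform: str = sys.platform,
) -> Union[str, List[str]]:
    """Hypothetical helper: the prefix is always a list of items."""
    items = [*command_prefix, executable, *passthrough_args]
    if platform == "win32":
        # Windows: keep the individual list items for subprocess.
        return items
    # Linux/Mac: combine the items into one command string (assumed separator).
    return " ".join(items)
```

With this shape, each plugin only ever appends list items to the prefix, and the platform-specific combination happens in one place.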
python/ray/_private/test_utils.py
if set(items) == whitelist:
Based on the term "whitelist" I would expect subset instead of == here, what do you think?
Yep, will update.
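For reference, a minimal sketch of the difference between the equality check and the subset check. The function name `only_whitelisted` and the folder names in the usage note are hypothetical, not the helper in test_utils.py.

```python
from typing import Iterable, Set


def only_whitelisted(items: Iterable[str], whitelist: Set[str]) -> bool:
    """Return True if every item is whitelisted (subset check).

    Unlike `set(items) == whitelist`, this also accepts a directory
    that holds none, or only some, of the whitelisted entries.
    """
    return set(items) <= whitelist
```

With `==`, an empty directory would be rejected whenever the whitelist is non-empty; with `<=`, `only_whitelisted([], {"working_dir_files"})` is accepted, while any entry outside the whitelist still fails the check.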
The tests will pass when adding the same construct as added in some of the other tests:
Basically ignore the top level folder in the
@architkulkarni Has something changed with the buildkite settings? After updating the code and importing the latest master I'm unable to view the build results/details. This makes it impossible to see why some of the steps failed. Looking at the links it appears the address previously was
@jbedorf Not 100% sure if it's related, but you're right that there was a recent change to our CI pipeline, but it shouldn't require any special action on our part. I would suggest merging the latest master but it looks like you've already done that. Which parts can you no longer see? From what I can tell at https://buildkite.com/ray-project/oss-ci-build-pr/builds/396
I see, I guess the project settings have changed regarding anonymous access. When I click on the current links it tells me I need a buildkite account. Whereas previous builds could be accessed by anyone. You can see it by opening the buildkite links in an incognito window. Anyway, I'll update my Linux environment and take a look at the failing tests.
@jbedorf Thanks for your feedback about the buildkite visibility, this is not intended. We are looking into fixing it!
@architkulkarni For example:
https://flaky-tests.ray.io/
Looks good to me! Before I merge this, it would be good to have a review from Windows expert @mattip who should be back from OOO soon.
This is nice. There is only one small nit with changing shell=True, but I think that is OK. I like that there are more tests enabled and the use of pytest for the tmpdir fixture.
@@ -82,7 +82,8 @@ def exec_worker(self, passthrough_args: List[str], language: Language):
         )
         logger.debug(f"Exec'ing worker with command: {command_str}")
         if sys.platform == "win32":
-            subprocess.run([executable, *passthrough_args])
+            cmd = [*self.command_prefix, executable, *passthrough_args]
+            subprocess.Popen(cmd, shell=True).wait()
There are some slight differences when running with shell=True. Is this intentional?
I found that the tests ran more stably with shell=True. I honestly don't understand exactly why, as the tests that fail with shell=False often pass when I run them individually. Maybe it has to do with the way pytest runs/keeps the state between runs, but I couldn't pin it down exactly.
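As a rough illustration of one of those differences, the sketch below runs the same argument list both ways. The helper names `run_direct` and `run_via_shell` are made up for this example and it does not reproduce the flakiness discussed above. It relies on documented `subprocess` behaviour: with shell=True on POSIX only the first list element is treated as the command, so the list is joined into one string there, whereas on Windows a list is converted to a command line automatically.

```python
import shlex
import subprocess
import sys
from typing import List


def run_direct(cmd: List[str]) -> str:
    """Run cmd without a shell (shell=False) and return its stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_via_shell(cmd: List[str]) -> str:
    """Run the same command through the platform shell (shell=True)."""
    # POSIX: pass one quoted string; Windows: the list is converted for us.
    shell_cmd = cmd if sys.platform == "win32" else shlex.join(cmd)
    result = subprocess.run(
        shell_cmd, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Going through the shell also means shell operators such as `&&` in a command prefix are interpreted rather than passed on as literal arguments, which is one reason a prefix-based command may need shell=True.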
 @pytest.mark.skipif(
     os.environ.get("CI") and sys.platform != "linux",
     reason="This test is only run on linux CI machines.",
 )
-def test_pip_with_env_vars(start_cluster):
+def test_pip_with_env_vars(start_cluster, tmp_path):
+1, this is a welcome change
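For readers unfamiliar with the fixture, here is a small sketch of how pytest's built-in `tmp_path` is typically used. The helper `write_requirements` and the package name are placeholders, not code from this PR.

```python
from pathlib import Path
from typing import List


def write_requirements(directory: Path, packages: List[str]) -> Path:
    """Hypothetical helper: write a requirements file into directory."""
    requirements = directory / "requirements.txt"
    requirements.write_text("\n".join(packages) + "\n")
    return requirements


def test_write_requirements(tmp_path):
    # pytest injects tmp_path as a fresh pathlib.Path unique to this test,
    # so the test does not have to create or clean up the directory itself.
    path = write_requirements(tmp_path, ["example-package==1.0"])
    assert path.parent == tmp_path
    assert path.read_text() == "example-package==1.0\n"
```

Because pytest manages the directory, the test body needs no manual temporary-directory handling, which should help with leftover files in the tests themselves.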
…roject#28589)
* Fix working_dir, conda and pip options for Windows
* More test fixes
* More test fixes
* More test fixes and style format fixes
* Style fixes
* Restructure, enable more tests
* Initial fixes to tests
* Fix lint errors
* Fix style
Signed-off-by: Jeroen Bédorf <[email protected]>
Signed-off-by: Weichen Xu <[email protected]>
Why are these changes needed?
Various options of the runtime_env method are currently broken on Windows due to changes in an earlier PR. That PR removed the use of the command_prefix in the context. This work restores that usage. However, because the launch methods differ between Linux and Windows, changes had to be made in multiple locations to ensure that lists are returned instead of strings.

To prevent this from breaking in the future, various tests have been fixed/enabled. However, there are some lingering issues with the tests:
* The working_dir is not deleted, because a folder that is in use cannot be deleted on Windows. This is accounted for in the tests by whitelisting that particular folder in various tests.
* For the conda environment option the deletion is not complete. Similarly, the deletion fails because a number of files and folders are in use, typically by the ray.util.client.server process. As such, not all tests are enabled for Windows, as they would keep failing and leave behind sizeable temporary files.

Other:
Related issue number
Checks
* I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
* I've run scripts/format.sh to lint the changes in this PR.