Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Prototype] [Core] Make expensive subpackage imports dynamic. #27658

Conversation

clarkzinzow
Copy link
Contributor

@clarkzinzow clarkzinzow commented Aug 8, 2022

Certain Ray subpackages are expensive to import, either due to their size, dependencies, or their import-time logic that must be executed. E.g. a change in a Datasets import caused a regression for the many_tasks nightly test, despite many_tasks only using Ray Core.

This PR delays the import of these expensive subpackages until attribute access, e.g. none of the Datasets code will be run until ray.data is accessed.

Related issue number

Closes #27606, closes #29557

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

def __getattr__(name: str):
if name in _subpackages:
return importlib.import_module("." + name, __name__)
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
Copy link
Contributor Author

@clarkzinzow clarkzinzow Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that, similar to an object's __getattr__ method, this __getattr__ will only be called if normal attribute lookup (e.g. via module.__getattribute__) fails, so this will not run for any of the above subpackages that are eagerly imported and should only run for the packages in _subpackages (or non-existent module attributes, which will error anyway).

# TODO(Clark): Remove this one we drop Python 3.6 support.
from ray import autoscaler # noqa: F401
from ray import data # noqa: F401
from ray import workflow # noqa: F401
Copy link
Contributor Author

@clarkzinzow clarkzinzow Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could implement something similar for Python 3.6 via a "replace module with class hack":

class module(types.ModuleType):
    __all__ = __all__

    def __init__(self, orig_module: types.ModuleType):
        self._orig_module = orig_module

    def __getattr__(self, name: str):
        try:
            return getattr(self._orig_module, name)
        except AttributeError as e:
            if name in _subpackages:
                return importlib.import_module("." + name, self._orig_module.__name__)
            raise e from None


sys.modules[__name__] = module(sys.modules[__name__])

But IMO this is more hacky than its worth since we're going to be dropping Python 3.6 support in the near-term.

@clarkzinzow clarkzinzow force-pushed the core/feat/dynamic-expensive-subpackage-imports branch from a0a233c to 5e8328f Compare August 8, 2022 21:50
@stale
Copy link

stale bot commented Sep 8, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Sep 8, 2022
@stale
Copy link

stale bot commented Sep 24, 2022

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

@stale stale bot closed this Sep 24, 2022
@clarkzinzow clarkzinzow reopened this Oct 18, 2022
@stale stale bot removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Oct 18, 2022
@clarkzinzow clarkzinzow added the core Issues that should be addressed in Ray Core label Oct 18, 2022
@clarkzinzow clarkzinzow force-pushed the core/feat/dynamic-expensive-subpackage-imports branch 2 times, most recently from 29d05e9 to d202401 Compare October 20, 2022 02:25
@clarkzinzow clarkzinzow force-pushed the core/feat/dynamic-expensive-subpackage-imports branch from 0a1de12 to 486a339 Compare October 20, 2022 02:51
@rkooo567
Copy link
Contributor

This one seems to be pretty good? What was the original reason why we decided to defer this PR?

@rkooo567
Copy link
Contributor

I'd love to merge it. What's the main concern of this approach?

@rkooo567 rkooo567 self-assigned this Oct 20, 2022
@clarkzinzow
Copy link
Contributor Author

@rkooo567 it was originally deferred because the many_tasks perf gain wasn't very significant (10-15% IIRC) and there are still some odd CI failures to investigate.

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Maybe we can just add minimal tests? Also the approach seems reasonable. Is there any concern on top of your head doing this?

from ray import internal # noqa: E402,F401
from ray import util # noqa: E402
from ray import util # noqa: E402,F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just OOC, why did it work before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure 🧐

python/ray/__init__.py Show resolved Hide resolved
python/ray/__init__.py Show resolved Hide resolved
python/ray/__init__.py Show resolved Hide resolved

del os
del logging
del sys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just leave it as it is? A little concern there might have been a reason why we didn't del... haha

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We previously weren't importing sys in this module, so this shouldn't be a change in behavior. sys import was added here: https://github.com/ray-project/ray/pull/27658/files#diff-f95026a08bcb464b58b036437876716d21d3b8630e61258303bcd5384d1d707cR4

@rkooo567
Copy link
Contributor

Also can you merge the latest master? I'd like to run many_tasks to see if the perf regression is fixed

@@ -150,7 +150,7 @@ def check(self):
runtime_env = {"py_modules": [S3_PACKAGE_URI]}

# Note: We should set a bigger timeout if downloads the s3 package slowly.
get_timeout = 10
get_timeout = 2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test even valid with this reduced timeout?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this change in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test fails otherwise, where the timeout error is never raised

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge after a couple runs of many_Tasks. Please do not merge latest master @clarkzinzow. I'd like to check if this fixes the issue without @jiaodong'sPR that will be merged soon.

@@ -150,7 +150,7 @@ def check(self):
runtime_env = {"py_modules": [S3_PACKAGE_URI]}

# Note: We should set a bigger timeout if downloads the s3 package slowly.
get_timeout = 10
get_timeout = 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this change in this PR?

@@ -47,6 +54,43 @@ def test_non_ray_modules():
assert "ray" in str(mod), f"Module {mod} should not be reachable via ray.{name}"


def test_dynamic_subpackage_import():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love the test!

@clarkzinzow
Copy link
Contributor Author

@rkooo567 sounds good, although before merging we should double-check the failing test_object_directory_failure test, as well as the tweak that I made to the runtime env test.

@rkooo567 rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 21, 2022
@rkooo567
Copy link
Contributor

promising sign btw. It passes the test https://buildkite.com/ray-project/release-tests-pr/builds/18731#0183f7a5-b88f-4269-b1ba-b03fdf4f9c8e. I am rerunning it

@clarkzinzow
Copy link
Contributor Author

@rkooo567 No spooky CI failures in the most recent run, worker registration timeout adjustment for the runtime env test appears to have done the trick.

rkooo567 pushed a commit that referenced this pull request Oct 22, 2022
This is a quick and relatively safer attempt to address #29324

In #28418 we attempted to unify ray.air utils with shared utils function but triggered expensive ray.data imports.

Where longer term and more robust solution should be #27658
rickyyx pushed a commit that referenced this pull request Oct 24, 2022
This is a quick and relatively safer attempt to address #29324

In #28418 we attempted to unify ray.air utils with shared utils function but triggered expensive ray.data imports.

Where longer term and more robust solution should be #27658
@clarkzinzow
Copy link
Contributor Author

@rkooo567 All currently failing tests were flaky in master at the time of the run, do you think this is good to merge?

@clarkzinzow
Copy link
Contributor Author

Tests are looking good, merging!

@clarkzinzow clarkzinzow merged commit 241a02e into ray-project:master Oct 24, 2022
clarkzinzow added a commit to clarkzinzow/ray that referenced this pull request Oct 25, 2022
clarkzinzow added a commit that referenced this pull request Oct 25, 2022
…#27658)" (#29659)

This reverts commit 241a02e, reverting PR #27658. This PR was making some GCS tests flaky (somehow).
rickyyx added a commit that referenced this pull request Nov 11, 2022
scv119 pushed a commit that referenced this pull request Nov 16, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
This is a quick and relatively safer attempt to address ray-project#29324

In ray-project#28418 we attempted to unify ray.air utils with shared utils function but triggered expensive ray.data imports.

Where longer term and more robust solution should be ray-project#27658

Signed-off-by: Weichen Xu <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…oject#27658)

Certain Ray subpackages are expensive to import, either due to their size, dependencies, or their import-time logic that must be executed. E.g. a change in a Datasets import caused a regression for the many_tasks nightly test, despite many_tasks only using Ray Core.

This PR delays the import of these expensive subpackages until attribute access, e.g. none of the Datasets code will be run until ray.data is accessed.

Signed-off-by: Weichen Xu <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…ray-project#27658)" (ray-project#29659)

This reverts commit 241a02e, reverting PR ray-project#27658. This PR was making some GCS tests flaky (somehow).

Signed-off-by: Weichen Xu <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. core Issues that should be addressed in Ray Core
Projects
None yet
2 participants