Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[serve] Make serve.shutdown() shut down remote Serve applications #23476

Merged
merged 10 commits into from
Mar 25, 2022

Conversation

shrekris-anyscale
Copy link
Contributor

@shrekris-anyscale shrekris-anyscale commented Mar 24, 2022

Why are these changes needed?

serve.shutdown() does not do anything if there is no cached handle to a Serve controller. If a user connects to a remote Ray cluster without running serve.start(), they cannot use serve.shutdown() to shut down the remote Serve application.

E.g. this does nothing, even if a Serve controller is running at "cluster_address" in namespace "x":

import ray
from ray import serve

ray.init(address="cluster_address", namespace="x")
serve.shutdown()

This change makes serve.shutdown() first try to connect to a Serve controller before attempting to shut down. This allows a user to directly shut down a running Serve instance without potentially having to start a Serve instance with serve.start() first.

Related issue number

Closes #22254.

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
      • New unit tests are added to test_standalone2.py.

python/ray/serve/api.py Show resolved Hide resolved
@edoakes
Copy link
Contributor

edoakes commented Mar 25, 2022

@shrekris-anyscale there was a windows test failure in test_standalone. Retried, but can you verify if it's a real failure?

@edoakes edoakes added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Mar 25, 2022
@shrekris-anyscale
Copy link
Contributor Author

@edoakes It is a real failure– Windows has some issues when trying to run temp files. I can either skip the test on Windows or refactor it to pass. I'll look into this.

@shrekris-anyscale shrekris-anyscale removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Mar 25, 2022
@shrekris-anyscale
Copy link
Contributor Author

shrekris-anyscale commented Mar 25, 2022

@edoakes I removed the context manager from the test to handle the Windows issues (see https://stackoverflow.com/a/15590253 for context), and it looks like test_standalone2.py is passing in Windows and Linux now.

@edoakes
Copy link
Contributor

edoakes commented Mar 25, 2022

@shrekris-anyscale can we update the DeploymentNode docstring to say "deployment graph" instead of "multi-deployment applicaton" ?

@edoakes
Copy link
Contributor

edoakes commented Mar 25, 2022

Sigh... wrong PR, ignore above. Clearly reviewing while distracted today...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Serve shutdown does not work for detached serve instance
3 participants