Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve][Part1] Update the tests to use graph deploy #26310

Merged
merged 13 commits into from
Jul 13, 2022

Conversation

sihanwang41
Copy link
Contributor

@sihanwang41 sihanwang41 commented Jul 5, 2022

Why are these changes needed?

  1. Update tests to use serve.run() instead of .deploy()
  2. get_handle() func removed
  3. some unit tests not changed because it is for testing the deprecating function.
  4. Add fix to include version for deployment graph. (Otherwise we always destroy all replicas and reprovision news in adding num replicas)
  5. To make it reviewable, this migration will have two prs.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@sihanwang41 sihanwang41 changed the title [Serve] Update the tests to use graph deploy [Serve][WIP] Update the tests to use graph deploy Jul 5, 2022
@sihanwang41 sihanwang41 changed the title [Serve][WIP] Update the tests to use graph deploy [Serve][Part1] Update the tests to use graph deploy Jul 7, 2022
@edoakes
Copy link
Contributor

edoakes commented Jul 8, 2022

@sihanwang41 can you explain in more detail the problem when you don't include the version change?

@edoakes
Copy link
Contributor

edoakes commented Jul 8, 2022

I think the behavior we want is for redeploying the graph to cause all replicas to be restarted.

@sihanwang41
Copy link
Contributor Author

I think the behavior we want is for redeploying the graph to cause all replicas to be restarted.

does it mean when user want to change num_replicas from 100 to 200, we will have to recreate all replicas? @edoakes

this is what i test. After change replicas to 3, all old exsiting replicas will be destroyed

from ray import serve
import ray
from ray.serve.drivers import DAGDriver
import requests
from ray.serve.controller import ServeController
from ray.serve.deployment_state import ReplicaState


@serve.deployment(route_prefix="/ModelA", version="v1", num_replicas=2)
class ModelA:
    def forward(self):
        return 1

client = serve.start()
handle = serve.run(ModelA.bind())
replicas = ray.get(
    client._controller._dump_replica_states_for_testing.remote("ModelA")
)
running_replicas = replicas.get([ReplicaState.RUNNING])
print([replica.replica_tag for replica in running_replicas])

serve.run(ModelA.options(num_replicas=3).bind())

replicas = ray.get(
    client._controller._dump_replica_states_for_testing.remote("ModelA")
)
running_replicas = replicas.get([ReplicaState.RUNNING])
print([replica.replica_tag for replica in running_replicas])

@sihanwang41
Copy link
Contributor Author

Synced with @edoakes offline, revert version tests.

@edoakes edoakes added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jul 12, 2022
Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants