[serve] remove support for nested DeploymentResponses #47209

Merged on Aug 21, 2024 (8 commits)

Contributor

zcin commented Aug 19, 2024

Why are these changes needed?

Remove support for passing DeploymentResponses in nested objects to downstream Serve deployment handle calls, e.g.:

handle.remote({"arg": other_handle.remote()})

This doesn't affect no-op request latency, but it improves latency for requests that carry a large payload.
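For callers who relied on the nested form, here is a minimal sketch of two alternatives: await the upstream response and nest its value, or pass the response as a top-level argument. `FakeHandle` and `FakeResponse` are hypothetical asyncio stand-ins so the snippet runs without Ray; they are not Serve's classes. That top-level responses are still resolved by the handle is an assumption of this sketch, not something stated in this description.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for a DeploymentResponse (not the Serve class)."""

    def __init__(self, coro):
        # Start the call right away, like a remote request in flight.
        self._task = asyncio.ensure_future(coro)

    def __await__(self):
        return self._task.__await__()


class FakeHandle:
    """Hypothetical stand-in for a deployment handle (not the Serve class)."""

    def __init__(self, fn):
        self._fn = fn

    def remote(self, *args, **kwargs):
        return FakeResponse(self._call(args, kwargs))

    async def _call(self, args, kwargs):
        # Assumption: only top-level arguments are resolved. Containers are
        # passed through untouched, so they are never walked or scanned.
        resolved_args = []
        for a in args:
            resolved_args.append(await a if isinstance(a, FakeResponse) else a)
        resolved_kwargs = {}
        for k, v in kwargs.items():
            resolved_kwargs[k] = await v if isinstance(v, FakeResponse) else v
        return self._fn(*resolved_args, **resolved_kwargs)


async def main():
    upstream = FakeHandle(lambda: 21)
    doubler = FakeHandle(lambda payload: payload["arg"] * 2)
    incrementer = FakeHandle(lambda x: x + 1)

    # Alternative 1: await the upstream response, then nest the plain value.
    value = await upstream.remote()
    nested_result = await doubler.remote({"arg": value})

    # Alternative 2: pass the response itself as a top-level argument.
    top_level_result = await incrementer.remote(upstream.remote())
    return nested_result, top_level_result


results = asyncio.run(main())
print(results)
```

The first alternative costs an extra await in the caller. The second keeps the call chain lazy but only works when the response is not wrapped in a dict, list, or other container.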

Current handle latencies:
[Screenshot taken 2024-08-19, not reproduced here]

New handle latencies:

{
  "handle_p50_latency": 1.7133924999939154,
  "handle_1mb_p50_latency": 2.931859499994971,
  "handle_10mb_p50_latency": 12.14768799999888,
}

Related issue number

closes #46428

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Cindy Zhang <[email protected]>
zcin added and removed the label "go add ONLY when ready to merge, run all tests" on Aug 19, 2024
zcin added the label "go add ONLY when ready to merge, run all tests" on Aug 20, 2024
zcin linked an issue on Aug 20, 2024 that may be closed by this pull request
Signed-off-by: Cindy Zhang <[email protected]>
zcin marked this pull request as ready for review on August 20, 2024 at 15:51
Contributor

edoakes commented Aug 20, 2024

@zcin let's make sure a good error message is raised if someone tries to pass a nested deployment response. You can do this by defining a __reduce__ method and raising an informative exception there.
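A minimal sketch of that idea, using only the standard library: pickle consults `__reduce__` when it serializes an object, so raising there turns an attempt to serialize a nested response into a readable error. `FakeDeploymentResponse`, its `request_id` field, and the message wording are hypothetical stand-ins for illustration, not the code merged in this PR.

```python
import pickle


class FakeDeploymentResponse:
    """Hypothetical stand-in; not Ray Serve's DeploymentResponse."""

    def __init__(self, request_id):
        self.request_id = request_id

    def __reduce__(self):
        # pickle calls this when it reaches the object, including when the
        # object sits inside a dict or list that is being serialized.
        raise RuntimeError(
            "FakeDeploymentResponse is not serializable. If it was nested "
            "inside another object, await it first or pass it as a "
            "top-level argument."
        )


def try_pickle(obj):
    """Return the error message if pickling fails, else None."""
    try:
        pickle.dumps(obj)
    except RuntimeError as exc:
        return str(exc)
    return None


error = try_pickle({"arg": FakeDeploymentResponse("req-1")})
print(error)
```

Because the exception is raised from inside the serializer, the user sees the hint at the call site that built the nested argument rather than a generic pickling failure further downstream.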

Contributor Author

zcin commented Aug 20, 2024

@edoakes Added an informative error message in __reduce__ of DeploymentResponse, PTAL!

Contributor

edoakes commented Aug 20, 2024

Let's add a test for it too, then LGTM
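A rough sketch of what such a test could look like, written with the standard library's `unittest` so it runs without Ray (the Ray test suite itself may be organized differently). `FakeDeploymentResponse` and the message text are hypothetical; the real test would exercise the actual class through a handle call.

```python
import pickle
import unittest


class FakeDeploymentResponse:
    """Hypothetical stand-in; not Ray Serve's DeploymentResponse."""

    def __reduce__(self):
        raise RuntimeError(
            "Nested FakeDeploymentResponse objects are not supported; "
            "await the response before passing it downstream."
        )


class NestedResponseErrorTest(unittest.TestCase):
    def test_nested_containers_raise(self):
        # Cover both a dict and a list, since either can hide a response.
        containers = (
            {"arg": FakeDeploymentResponse()},
            [FakeDeploymentResponse()],
        )
        for container in containers:
            with self.subTest(container=type(container).__name__):
                with self.assertRaisesRegex(RuntimeError, "not supported"):
                    pickle.dumps(container)

    def test_plain_payload_still_pickles(self):
        # Ordinary payloads must be unaffected by the new error path.
        self.assertIsInstance(pickle.dumps({"arg": b"x" * 1024}), bytes)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NestedResponseErrorTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Matching on a fragment of the message, rather than the full string, keeps the test stable if the wording is later polished.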

zcin merged commit d37b18d into ray-project:master on Aug 21, 2024
5 checks passed
zcin deleted the remove-dep-resp-arg branch on August 21, 2024 at 15:10
Successfully merging this pull request may close these issues.

[core][serve] 1MB latency performance regression
2 participants