[serve] Deprecate passing DeploymentResponse to handle #46806

Merged: 6 commits, Aug 5, 2024
14 changes: 11 additions & 3 deletions doc/source/serve/model_composition.md
@@ -128,12 +128,20 @@ Example:
:language: python
```

## Advanced: Pass a DeploymentResponse "by reference"
## Advanced: Pass a DeploymentResponse in a nested object [DEPRECATED]

:::{warning}
Passing a `DeploymentResponse` to downstream handle calls in nested objects is deprecated and will be removed in the next release.
Ray Serve will no longer handle converting them to Ray `ObjectRef`s for you.
Please manually use `DeploymentResponse._to_object_ref()` instead to pass the corresponding object reference in nested objects.

Passing a `DeploymentResponse` object as a top-level argument or keyword argument is still supported.
:::

By default, when you pass a `DeploymentResponse` to another `DeploymentHandle` call, Ray Serve passes the result of the `DeploymentResponse` directly to the downstream method once it's ready.
However, in some cases you might want to start executing the downstream call before the result is ready. For example, to do some preprocessing or fetch a file from remote storage.
To accomplish this behavior, pass the `DeploymentResponse` "by reference" by embedding it in another Python object, such as a list or dictionary.
When you pass responses by reference, Ray Serve replaces them with Ray `ObjectRef`s instead of the resulting value and they can start executing before the result is ready.
To accomplish this behavior, pass the `DeploymentResponse` embedded in another Python object, such as a list or dictionary.
When you pass responses in a nested object, Ray Serve replaces them with Ray `ObjectRef`s instead of the resulting value and they can start executing before the result is ready.
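The difference between the two passing styles can be sketched without Ray at all. The following is only an `asyncio` analogy, not Ray Serve's API: `preprocess`, `downstream_by_value`, `downstream_nested`, and `run_demo` are hypothetical names, and an `asyncio.Task` stands in for the `ObjectRef` that the downstream call would receive.

```python
import asyncio


async def preprocess(x: int, log: list) -> int:
    # Hypothetical stand-in for a slow upstream deployment call.
    await asyncio.sleep(0.01)
    log.append("upstream done")
    return x * 2


async def downstream_by_value(value: int, log: list) -> int:
    # Top-level style: the argument is already the finished result.
    log.append("by_value started")
    return value + 1


async def downstream_nested(wrapped: list, log: list) -> int:
    # Nested style: the list holds a pending task (the analogue of an
    # ObjectRef), so this body starts running before the result exists.
    log.append("nested started")
    value = await wrapped[0]
    return value + 1


async def run_demo():
    log = []

    # Top-level style: the upstream result is resolved before the call.
    value = await preprocess(1, log)
    r1 = await downstream_by_value(value, log)

    # Nested style: the downstream call receives the pending task itself.
    pending = asyncio.create_task(preprocess(1, log))
    r2 = await downstream_nested([pending], log)
    return r1, r2, log


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

In the first half the upstream step finishes before the downstream body runs; in the second half the downstream body is entered first and waits on the pending result itself.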

The example below has two deployments: a preprocessor and a downstream model that takes the output of the preprocessor.
The downstream model has two methods:
7 changes: 7 additions & 0 deletions python/ray/serve/_private/router.py
@@ -449,6 +449,13 @@ async def _resolve_deployment_responses(
    )
elif isinstance(obj, DeploymentResponse):
    responses.append(obj)
    if obj not in request_args and obj not in request_kwargs.values():
Contributor:
I assume you've tested that this check works and we don't spuriously print?

Contributor Author:
Yup, this will only print for deployment response objects that weren't passed in as top level args/kwargs.

        logger.warning(
            "Passing `DeploymentResponse` objects in nested objects to "
            "downstream handle calls is deprecated and will not be "
            "supported in the future. Pass them as top-level "
            "args or kwargs instead."
        )

# This is no-op replacing the object with itself. The purpose is to make
# sure both object refs and object ref generator are not getting pinned
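
To make the question from the review thread concrete, here is a minimal, Ray-free sketch of the membership check added in this hunk. It is not the router's code: `FakeResponse` and `find_nested_responses` are hypothetical names, and a simple recursive walk over lists, tuples, sets, and dicts stands in for however the router actually locates responses inside the arguments.

```python
import logging
from typing import Any, Dict, List, Tuple

logger = logging.getLogger("nested_response_sketch")


class FakeResponse:
    """Hypothetical stand-in for DeploymentResponse (not the Ray class)."""

    def __init__(self, name: str):
        self.name = name


def find_nested_responses(
    request_args: Tuple[Any, ...], request_kwargs: Dict[str, Any]
) -> List[FakeResponse]:
    """Return responses that appear only inside nested containers."""
    nested: List[FakeResponse] = []

    def visit(obj: Any) -> None:
        if isinstance(obj, FakeResponse):
            # Same condition as the diff: a response counts as top-level
            # if it is directly one of the args or one of the kwarg values.
            if obj not in request_args and obj not in request_kwargs.values():
                nested.append(obj)
                logger.warning("Nested response found: %s", obj.name)
        elif isinstance(obj, (list, tuple, set)):
            for item in obj:
                visit(item)
        elif isinstance(obj, dict):
            for value in obj.values():
                visit(value)

    visit(request_args)
    visit(request_kwargs)
    return nested


if __name__ == "__main__":
    a, b, c = FakeResponse("a"), FakeResponse("b"), FakeResponse("c")
    # `a` and `c` are top-level; `b` is inside a list, so only `b` is flagged.
    print([r.name for r in find_nested_responses((a, [b]), {"k": c})])
```

In this sketch, a response that is passed both at the top level and inside a container is not flagged, because the membership test finds it among the top-level arguments.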