[serve] Deprecate passing DeploymentResponse to handle (#46806)

## Why are these changes needed?

Deprecate passing a deployment response to a handle by reference
([docs](https://docs.ray.io/en/latest/serve/model_composition.html#advanced-pass-a-deploymentresponse-by-reference)).

Supporting this requires using pyobjscanner to recursively process and
pickle-dump all args, which adds latency, especially for large
payloads.
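
To make the cost concrete, here is a minimal sketch of the difference between checking top-level arguments and scanning nested containers. It is not Ray Serve's implementation: `FakeResponse`, `iter_responses`, and `nested_responses` are hypothetical names invented for illustration, and the real scanner also pickles the arguments, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Tuple


@dataclass(eq=False)
class FakeResponse:
    """Hypothetical stand-in for a DeploymentResponse (not a Ray class)."""

    name: str


def iter_responses(obj: Any) -> Iterator[FakeResponse]:
    """Recursively yield every FakeResponse found in common containers."""
    if isinstance(obj, FakeResponse):
        yield obj
    elif isinstance(obj, dict):
        for value in obj.values():
            yield from iter_responses(value)
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            yield from iter_responses(item)


def nested_responses(
    args: Tuple[Any, ...], kwargs: Dict[str, Any]
) -> List[FakeResponse]:
    """Return responses that are not passed as top-level args or kwargs."""
    # Top-level check: one pass over the arguments, no recursion needed.
    top_level = {id(a) for a in args if isinstance(a, FakeResponse)}
    top_level |= {id(v) for v in kwargs.values() if isinstance(v, FakeResponse)}
    # Nested check: has to walk the whole payload, so it scales with its size.
    found = list(iter_responses(list(args))) + list(iter_responses(kwargs))
    return [r for r in found if id(r) not in top_level]
```

In this sketch, the top-level check touches each argument once, while the nested check visits every element of every container. That full traversal is the kind of work the deprecation aims to avoid.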

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
zcin authored Aug 5, 2024
1 parent 2fe85f9 commit b514157
Showing 2 changed files with 18 additions and 3 deletions.
14 changes: 11 additions & 3 deletions doc/source/serve/model_composition.md
@@ -128,12 +128,20 @@ Example:
 :language: python
 ```

-## Advanced: Pass a DeploymentResponse "by reference"
+## Advanced: Pass a DeploymentResponse in a nested object [DEPRECATED]

+:::{warning}
+Passing a `DeploymentResponse` to downstream handle calls in nested objects is deprecated and will be removed in the next release.
+Ray Serve will no longer handle converting them to Ray `ObjectRef`s for you.
+Please manually use `DeploymentResponse._to_object_ref()` instead to pass the corresponding object reference in nested objects.
+
+Passing a `DeploymentResponse` object as a top-level argument or keyword argument is still supported.
+:::
+
 By default, when you pass a `DeploymentResponse` to another `DeploymentHandle` call, Ray Serve passes the result of the `DeploymentResponse` directly to the downstream method once it's ready.
 However, in some cases you might want to start executing the downstream call before the result is ready. For example, to do some preprocessing or fetch a file from remote storage.
-To accomplish this behavior, pass the `DeploymentResponse` "by reference" by embedding it in another Python object, such as a list or dictionary.
-When you pass responses by reference, Ray Serve replaces them with Ray `ObjectRef`s instead of the resulting value and they can start executing before the result is ready.
+To accomplish this behavior, pass the `DeploymentResponse` embedded in another Python object, such as a list or dictionary.
+When you pass responses in a nested object, Ray Serve replaces them with Ray `ObjectRef`s instead of the resulting value and they can start executing before the result is ready.

 The example below has two deployments: a preprocessor and a downstream model that takes the output of the preprocessor.
 The downstream model has two methods:
7 changes: 7 additions & 0 deletions python/ray/serve/_private/router.py
@@ -456,6 +456,13 @@ async def _resolve_deployment_responses(
                     )
                 elif isinstance(obj, DeploymentResponse):
                     responses.append(obj)
+                    if obj not in request_args and obj not in request_kwargs.values():
+                        logger.warning(
+                            "Passing `DeploymentResponse` objects in nested objects to "
+                            "downstream handle calls is deprecated and will not be "
+                            "supported in the future. Pass them as top-level "
+                            "args or kwargs instead."
+                        )

             # This is no-op replacing the object with itself. The purpose is to make
             # sure both object refs and object ref generator are not getting pinned