Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve] Modify max_concurrency in actor options to respect max_ongoing_requests (#47681) #48274

Merged

Conversation

akyang-anyscale
Copy link
Contributor

Why are these changes needed?

This PR modifies the actor_options used when deploying replicas. Deployment will use the configured max_ongoing_requests attribute of the deployment config as the replica's max_concurrency if
the concurrency is not explicitly set. This is to prevent replica's max_concurrency from capping
max_ongoing_requests.

Related issue number

Closes #47681

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…ng_requests

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale changed the title [Serve] Modify max_concurrency in actor options to respect max_ongoing_requests (#47681) [Serve] Modify max_concurrency in actor options to respect max_ongoing_requests (https://github.com/ray-project/ray/issues/47681) Oct 25, 2024
@akyang-anyscale akyang-anyscale changed the title [Serve] Modify max_concurrency in actor options to respect max_ongoing_requests (https://github.com/ray-project/ray/issues/47681) [Serve] Modify max_concurrency in actor options to respect max_ongoing_requests (#47681) Oct 25, 2024
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale force-pushed the alexyang/serve-max-ongoing-capped branch from f64fe8c to d4b81e3 Compare October 28, 2024 17:37
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale added the go add ONLY when ready to merge, run all tests label Oct 28, 2024
@akyang-anyscale akyang-anyscale marked this pull request as ready for review October 28, 2024 22:51
Copy link
Contributor

@zcin zcin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! were you able to test this e2e manually?

# unintentional behavior, use `max_ongoing_requests` to override
# the Actor's `max_concurrency` if it is not explicitly set.
if (
"max_concurrency" not in actor_options
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ray_actor_options is limited to a structured schema RayActorOptionsSchema, which doesn't allow max_concurrency, so I think you don't need to check for max_concurrency

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this was trying to future proof

# Default max_concurrency option in @ray.remote for threaded actors.
DEFAULT_MAX_CONCURRENCY_THREADED = 1

# Default max_concurrency option in @ray.remote for aysnc actors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Default max_concurrency option in @ray.remote for aysnc actors.
# Default max_concurrency option in @ray.remote for async actors.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale
Copy link
Contributor Author

Yes tested with the script provided in the issue

from ray import serve
from ray.serve.handle import DeploymentHandle
import asyncio

@serve.deployment(max_ongoing_requests=4096)
class Model:
    @serve.batch(max_batch_size=2048, batch_wait_timeout_s=2)
    async def __call__(self, ls: list[int]) -> list[int]:
        print(f"Length of input list: {len(ls)}")
        return ls

async def main() -> None:
    handle: DeploymentHandle = serve.run(Model.bind())
    await asyncio.gather(*[handle.remote(i) for i in range(2048)])

if __name__ == "__main__":
    asyncio.run(main())

Before:

$ python test.py
Length of input list: 1000 
Length of input list: 1000 
Length of input list: 48

After:

$ python test.py
Length of input list: 2048

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@zcin zcin enabled auto-merge (squash) October 29, 2024 16:31
@zcin
Copy link
Contributor

zcin commented Oct 29, 2024

@akyang-anyscale lint is failing

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@github-actions github-actions bot disabled auto-merge October 29, 2024 18:13
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@zcin zcin merged commit d560ecd into ray-project:master Oct 30, 2024
5 checks passed
@akyang-anyscale akyang-anyscale deleted the alexyang/serve-max-ongoing-capped branch October 30, 2024 04:22
Jay-ju pushed a commit to Jay-ju/ray that referenced this pull request Nov 5, 2024
…oing_requests` (ray-project#47681) (ray-project#48274)

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->
This PR modifies the actor_options used when deploying replicas.
Deployment will use the configured `max_ongoing_requests` attribute of
the deployment config as the replica's `max_concurrency` if
the concurrency is not explicitly set. This is to prevent replica's
`max_concurrency` from capping
`max_ongoing_requests`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->
Closes ray-project#47681



Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
JP-sDEV pushed a commit to JP-sDEV/ray that referenced this pull request Nov 14, 2024
…oing_requests` (ray-project#47681) (ray-project#48274)

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->
This PR modifies the actor_options used when deploying replicas.
Deployment will use the configured `max_ongoing_requests` attribute of
the deployment config as the replica's `max_concurrency` if
the concurrency is not explicitly set. This is to prevent replica's
`max_concurrency` from capping
`max_ongoing_requests`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->
Closes ray-project#47681



Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
mohitjain2504 pushed a commit to mohitjain2504/ray that referenced this pull request Nov 15, 2024
…oing_requests` (ray-project#47681) (ray-project#48274)

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->
This PR modifies the actor_options used when deploying replicas.
Deployment will use the configured `max_ongoing_requests` attribute of
the deployment config as the replica's `max_concurrency` if
the concurrency is not explicitly set. This is to prevent replica's
`max_concurrency` from capping
`max_ongoing_requests`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->
Closes ray-project#47681

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: mohitjain2504 <mohit.jain@dream11.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[serve] max_ongoing_requests limited by max_concurrency in actor
2 participants