[Bug]: Using Ray with compiled DAG throws the "The compiled graph can't have more than 10 in-flight executions" error #12747

Open
1 task done
tczekajlo opened this issue Feb 4, 2025 · 2 comments
Assignees
Labels
bug (Something isn't working), ray (anything related with ray)

Comments

@tczekajlo

Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

🐛 Describe the bug

I'm trying to run DeepSeek R1 using vLLM with Ray and aDAG. As soon as I send the first request I get the following error:

  File "/home/ray/anaconda3/lib/python3.12/site-packages/vllm/engine/async_llm_engine.py", line 825, in run_engine_loop
    result = task.result()
             ^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/vllm/engine/async_llm_engine.py", line 748, in engine_step
    request_outputs = await self.engine.step_async(virtual_engine)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/vllm/engine/async_llm_engine.py", line 353, in step_async
    outputs = await self.model_executor.execute_model_async(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/vllm/executor/ray_distributed_executor.py", line 575, in execute_model_async
    dag_future = await self.forward_dag.execute_async(serialized_data)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/dag/compiled_dag_node.py", line 2186, in execute_async
    self._raise_if_too_many_inflight_executions()
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/dag/compiled_dag_node.py", line 1918, in _raise_if_too_many_inflight_executions
    raise ray.exceptions.RayCgraphCapacityExceeded(
ray.exceptions.RayCgraphCapacityExceeded: System error: The compiled graph can't have more than 10 in-flight executions, and you currently have 10 in-flight executions. Retrieve an output using ray.get before submitting more requests or increase `_max_inflight_executions`. `dag.experimental_compile(_max_inflight_executions=...)`
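
For readers unfamiliar with the limit named in the message, here is a minimal, Ray-free sketch of what "in-flight executions" means and of the first remedy the message suggests (retrieve an output before submitting more). Everything in it is hypothetical and for illustration only: `ToyGraph`, `CapacityExceeded`, and `run_all` are made-up names, and the toy does not reproduce how Ray or vLLM track executions internally.

```python
from collections import deque


class CapacityExceeded(RuntimeError):
    """Hypothetical stand-in for ray.exceptions.RayCgraphCapacityExceeded."""


class ToyGraph:
    """Toy pipeline that refuses new work once too many results are unread."""

    def __init__(self, max_inflight: int = 10) -> None:
        self.max_inflight = max_inflight
        self._unread = deque()  # results computed but not yet retrieved

    @property
    def inflight(self) -> int:
        return len(self._unread)

    def execute(self, x: int) -> None:
        # Reject the submission when the cap of unread results is reached.
        if self.inflight >= self.max_inflight:
            raise CapacityExceeded(
                f"can't have more than {self.max_inflight} in-flight executions"
            )
        self._unread.append(x * 2)  # pretend the graph doubles its input

    def get(self) -> int:
        # Retrieving the oldest result frees one in-flight slot.
        return self._unread.popleft()


def run_all(graph: ToyGraph, inputs) -> list:
    """Submit every input, draining the oldest result whenever the cap is hit."""
    results = []
    for x in inputs:
        if graph.inflight >= graph.max_inflight:
            results.append(graph.get())
        graph.execute(x)
    while graph.inflight:
        results.append(graph.get())
    return results
```

In this toy, submitting an eleventh input without calling `get()` raises `CapacityExceeded`, while `run_all` never does because it drains before each submission that would exceed the cap. The second remedy in the message, raising `_max_inflight_executions` in `dag.experimental_compile(...)`, would have to be applied where vLLM compiles the graph, so it is not something a user request can change directly.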

The Ray version that I'm using is 2.41.0.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@tczekajlo added the bug label on Feb 4, 2025
@darthhexx
Contributor

I'm running into the same error with 2.42.0.

@ruisearch42 added the ray label on Feb 19, 2025
@ruisearch42
Collaborator

ruisearch42 commented Feb 19, 2025

Hi, this is a known issue in Ray 2.41 that we are working on. It will be fixed once a new Ray version is released.

As a short-term workaround, please downgrade to Ray 2.40.
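
To check whether an environment is on one of the Ray versions mentioned in this thread before applying the workaround, something like the sketch below may help. It is only an illustration: the helper names (`major_minor`, `is_reported_affected`, `check_installed_ray`) are hypothetical, and the set of versions contains only what was reported here (2.41.0 by the reporter, 2.42.0 by a commenter), not an authoritative list of affected releases.

```python
from importlib.metadata import PackageNotFoundError, version

# (major, minor) pairs reported in this thread; an assumption, not an official list.
REPORTED_AFFECTED = {(2, 41), (2, 42)}


def major_minor(v: str) -> tuple:
    """Return the (major, minor) part of a version string such as '2.41.0'."""
    parts = v.split(".")
    return int(parts[0]), int(parts[1])


def is_reported_affected(v: str) -> bool:
    """True if the version matches one reported as failing in this thread."""
    return major_minor(v) in REPORTED_AFFECTED


def check_installed_ray() -> str:
    """Describe the installed Ray version relative to the reports in this thread."""
    try:
        installed = version("ray")
    except PackageNotFoundError:
        return "ray is not installed"
    if is_reported_affected(installed):
        return f"ray {installed} matches a version reported as affected; consider ray 2.40"
    return f"ray {installed} is not among the versions reported in this thread"


if __name__ == "__main__":
    print(check_installed_ray())
```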

@ruisearch42 self-assigned this on Feb 26, 2025
@hmellor moved this to Backlog in Ray on Feb 28, 2025
@hmellor added this to Ray on Feb 28, 2025
@ruisearch42 moved this from Backlog to In progress in Ray on Feb 28, 2025
Projects
Status: In progress
Development

No branches or pull requests

3 participants