When tp>1 vllm not work (Qwen2.5-VL-72B) #13124
(qwen25vl) lzh@instance-aw1rhmsz-5:~/Code/qwen2_5vl/vllm$ python collect_env.py
Collecting environment information...
OS: Ubuntu 22.04.1 LTS (x86_64)
Python version: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0] (64-bit runtime)
Nvidia driver version: 535.154.05
CPU:
Versions of relevant libraries:
Legend: X = Self
NIC Legend: NIC0: mlx5_0
LD_LIBRARY_PATH=/pfs/mt-hiEd6E/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/cv2/../../lib64:

@ywang96 The above is the result of collect_env.py.
You might have to show the whole stack trace, since the error you're showing is just the normal frontend server failure when the engine fails to start.
When I use Qwen2.5-VL-72B-Instruct, there is a significant discrepancy between the inference results obtained with PyTorch and those obtained with vLLM, even with the same generation args.
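For the comparison above to be meaningful, the generation arguments must map one-to-one between the two backends (e.g. `max_new_tokens` in `transformers` vs `max_tokens` in vLLM's `SamplingParams`). A minimal sketch of such a mapping; the translation table is an illustrative assumption, not an official vLLM utility:

```python
# Sketch: translate HuggingFace generate() kwargs to vLLM SamplingParams kwargs
# so both backends decode under the same settings. This mapping is an
# illustrative assumption, not an official vLLM helper.
HF_TO_VLLM = {
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
    "repetition_penalty": "repetition_penalty",
    "max_new_tokens": "max_tokens",  # names differ between the two APIs
}

def to_vllm_sampling_kwargs(hf_kwargs: dict) -> dict:
    """Keep only the keys vLLM understands, renaming where needed."""
    return {HF_TO_VLLM[k]: v for k, v in hf_kwargs.items() if k in HF_TO_VLLM}

hf_cfg = {"temperature": 0.01, "top_p": 0.8, "max_new_tokens": 128, "do_sample": True}
print(to_vllm_sampling_kwargs(hf_cfg))
# -> {'temperature': 0.01, 'top_p': 0.8, 'max_tokens': 128}
```

Note that keys without a vLLM counterpart (like `do_sample`) are dropped, so any comparison should state which knobs were actually aligned.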
@LaoWangGB I am facing the same issue here. Were you able to solve it? |
Set distributed_executor_backend="ray" when initializing the LLM, and check the sampling args in vLLM, such as temperature, frequency_penalty, and repetition_penalty. The diffs become small but still exist.
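The suggested workaround can be sketched as follows. This is a minimal sketch, not a verified fix: the `tensor_parallel_size` value and the sampling settings are illustrative assumptions, and running it requires a multi-GPU machine with Ray installed.

```python
from vllm import LLM, SamplingParams

# Sketch of the workaround suggested above: use the Ray executor for tp > 1.
# tensor_parallel_size=4 and the sampling values are illustrative assumptions.
llm = LLM(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    tensor_parallel_size=4,
    distributed_executor_backend="ray",
)

# Pin the sampling args explicitly so they match the PyTorch side.
params = SamplingParams(
    temperature=0.0,           # greedy decoding is easiest to compare
    repetition_penalty=1.0,
    frequency_penalty=0.0,
    max_tokens=512,
)
outputs = llm.generate(["Describe this image."], params)
```

Greedy decoding (temperature=0.0) removes sampling noise, so any remaining output difference points at numerics rather than sampling.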
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 911, in <module>
uvloop.run(run_server(args))
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/uvloop/__init__.py", line 109, in run
return __asyncio.run(
^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/uvloop/__init__.py", line 61, in wrapper
return await main
^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 875, in run_server
async with build_async_engine_client(args) as engine_client:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client
async with build_async_engine_client_from_engine_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lzh/anaconda3/envs/qwen25vl/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 230, in build_async_engine_client_from_engine_args
raise RuntimeError(
RuntimeError: Engine process failed to start. See stack trace for the root cause.
Could you please help with this? I seem to have encountered a similar error: it only occurs when tp>1.
@ywang96 How can I solve it?
Originally posted by @ZhonghaoLu in #12604 (comment)