
anyone test chatglm3-6b? set tensor_parallel_size=4, get wrong response #1735

Closed
white-wolf-tech opened this issue Nov 21, 2023 · 5 comments · Fixed by #2379

Comments

@white-wolf-tech

With tensor_parallel_size=1 or tensor_parallel_size=2, the response is OK; only tensor_parallel_size=4 produces a wrong response.

My environment info:

vllm==0.2.2
ray==2.8.0
transformers==4.34.0
torch==2.1.0

@zyt1024

zyt1024 commented Nov 22, 2023

Hello, how do you run on multiple GPUs in a single machine? When I set tensor_parallel_size=2, the following error occurred:

File "/home/xxx/offline_inference_chatglm.py", line 7, in <module>
    llm = LLM(model=model_path, trust_remote_code=True, tensor_parallel_size=2)
  File "/home/xxx/vllm/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/home/xxx/vllm/vllm/engine/llm_engine.py", line 228, in from_engine_args
    distributed_init_method, placement_group = initialize_cluster(
  File "/home/xxx/vllm/vllm/engine/ray_utils.py", line 77, in initialize_cluster
    ray.init(address=ray_address, ignore_reinit_error=True)
  File "/home/xxx/miniforge3/envs/vllm/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/home/xxx/miniforge3/envs/vllm/lib/python3.10/site-packages/ray/_private/worker.py", line 1486, in init
    bootstrap_address = services.canonicalize_bootstrap_address(address, _temp_dir)
  File "/home/xxx/miniforge3/envs/vllm/lib/python3.10/site-packages/ray/_private/services.py", line 530, in canonicalize_bootstrap_address
    addr = get_ray_address_from_environment(addr, temp_dir
File "/home/xxx/miniforge3/envs/vllm/lib/python3.10/site-packages/psutil/_common.py", line 772, in open_binary
    return open(fname, "rb", buffering=FILE_READ_BUFFER_SIZE)
FileNotFoundError: [Errno 2] No such file or directory: '/proc/152830/stat'

@white-wolf-tech
Author

white-wolf-tech commented Nov 28, 2023

Maybe this is the cause? [screenshot of the ChatGLM3-6B config, which sets:]

multi_query_group_num = 2

Does that mean it only supports tensor_parallel_size up to 2?
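For context: multi_query_group_num = 2 means ChatGLM3-6B uses multi-query attention with only 2 key/value heads. A tensor-parallel engine shards attention heads across GPUs, so a tp_size that does not divide the KV-head count evenly (here, 4 ranks over 2 KV heads requires replication) can fail or silently produce wrong output if replication is not handled. The sketch below only illustrates that divisibility constraint; it is not vLLM's actual implementation, and the real fix landed separately in #2379.

```python
# Illustrative sketch (not vLLM code): how many KV heads each
# tensor-parallel rank can own for a given model / tp_size.

def kv_heads_per_rank(num_kv_heads: int, tp_size: int) -> int:
    """Return the number of KV heads per rank, raising if the
    heads cannot be split or replicated evenly."""
    if tp_size <= num_kv_heads:
        # Fewer ranks than KV heads: shard them evenly.
        if num_kv_heads % tp_size != 0:
            raise ValueError(
                f"tp_size={tp_size} does not divide {num_kv_heads} KV heads")
        return num_kv_heads // tp_size
    # More ranks than KV heads: each head must be replicated
    # across tp_size // num_kv_heads ranks.
    if tp_size % num_kv_heads != 0:
        raise ValueError(
            f"{num_kv_heads} KV heads cannot be replicated evenly "
            f"across tp_size={tp_size} ranks")
    return 1

print(kv_heads_per_rank(2, 1))  # 2: one rank owns both KV heads
print(kv_heads_per_rank(2, 2))  # 1: one KV head per rank
print(kv_heads_per_rank(2, 4))  # 1: each KV head replicated on 2 ranks
```

So tensor_parallel_size=4 is not inherently impossible with 2 KV heads, but it requires the engine to replicate each KV head across ranks correctly; a bug in that replication path would match the "wrong response only at tp=4" symptom.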

@white-wolf-tech
Author

> hello, how to run multi card in single machine? when I set tensor_parallel_size=2, the following error occurred:
>
> [same FileNotFoundError: '/proc/152830/stat' traceback as quoted above]

reinstall vllm in a new conda env?

@white-wolf-tech
Author

@gameofdimension any suggestion? (;´༎ຶД༎ຶ`)

@tanguofu

> maybe this ? [screenshot]
>
> multi_query_group_num = 2
>
> only support tensor_parallel_size=2?

Where is that code? I did not find it in the vLLM main branch.
