s-natsubori changed the title from "GPTBigCodeForCausalLM, tensor_parallel_size >= 2, Generation bloken. Is this BUG?" to "GPTBigCodeForCausalLM, TP >= 2, output is bloken. Is this BUG?" on Jan 11, 2024
I updated vLLM from version 0.2.1.post1 to 0.2.7.
Model generation is broken when tensor_parallel_size >= 2.
(tensor_parallel_size=1 is NOT broken.)
I first found it with my Starchat-β + AWQ model,
and it also reproduces with the non-AWQ models HuggingFaceH4/starchat-beta
and bigcode/starcoderbase-1b.
Base Env
Old Env
Engine args
Input
generated
New Env
Engine args
Input
generated
or starchat-beta generates
With the old env, updating only vllm to 0.2.2 is enough to break generation.
So I guess it is a bug in tensor_parallel.
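For reference, a minimal repro sketch against vLLM's Python API (a sketch only: it assumes a machine with at least 2 GPUs, and the model name and sampling settings here are illustrative stand-ins for the exact engine args elided above). With tensor_parallel_size=1 the output is expected to be coherent; with tensor_parallel_size >= 2 on the affected versions, the generated text comes out broken.

```python
# Repro sketch (assumptions: >= 2 GPUs available; model and sampling
# settings are illustrative, not the reporter's exact configuration).
from vllm import LLM, SamplingParams

# Reportedly fine with tensor_parallel_size=1; broken output with >= 2.
llm = LLM(model="bigcode/starcoderbase-1b", tensor_parallel_size=2)

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["def fibonacci(n):"], params)
for out in outputs:
    print(out.outputs[0].text)
```

Running the same script with tensor_parallel_size=1 versus 2 (all other settings fixed) isolates the tensor-parallel path as the variable, matching the vllm==0.2.2 bisection above.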