neural-chat-7b-v3-1: vllm.entrypoints.openai.api_server request returns multiple turns of dialogue instead of one #2256
Unanswered
cninnovationai asked this question in Q&A
Replies: 1 comment
IMO this should be something you can configure as additional stop words when you start vLLM, so you don't have to deal with adding them manually.
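Until something like that exists on the server side, one thing that might be worth trying is passing the stop sequence with each request. This is only a sketch: it assumes the server is reachable at `http://localhost:8000/v1/completions`, that the model name is `Intel/neural-chat-7b-v3-1`, and that `[INST]` is the marker that starts the unwanted extra turn. The helper names and sampling values are made up for illustration; `stop` is the standard OpenAI-style request field.

```python
import json
import urllib.request

# Assumption: the extra turns begin with "[INST]", as in the output described in the question.
STOP_SEQUENCES = ["[INST]"]


def build_completion_payload(prompt, model="Intel/neural-chat-7b-v3-1"):
    """Build a completions request body that stops generation at a new turn marker."""
    return {
        "model": model,          # assumption: must match the name the server was started with
        "prompt": prompt,
        "max_tokens": 256,       # illustrative value
        "temperature": 0.7,      # illustrative value
        "stop": STOP_SEQUENCES,  # OpenAI-style "stop" field
    }


def post_completion(payload, url="http://localhost:8000/v1/completions"):
    """POST the payload as JSON and return the decoded response (not called here)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_completion_payload("[INST] What is vLLM? [/INST]")
print(json.dumps(payload, indent=2))
```

The stop sequence is only matched against generated text, so the `[INST]` in the prompt itself should not end generation early.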
I'm loading the checkpoints via vllm.
The inference command which I'm running is:
The problem is that a single request sometimes returns multiple turns of dialogue.
The curl request looks like this:
Most of the time the response is normal, like this:
But sometimes it goes wrong and the response looks like this (it returns multiple rounds of dialogue -> [INST] xxxxxx [/INST]):
Does anyone know how to solve this problem? I would be very grateful.
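As a client-side stopgap, the returned text could also be cut at the first stray turn marker before it is used. This is a minimal sketch, assuming `[INST]` is the marker that begins the unwanted extra rounds; the function name and the sample string are hypothetical.

```python
def truncate_at_turn_marker(text, markers=("[INST]",)):
    """Return text up to the first occurrence of any marker, with trailing whitespace removed."""
    cut = len(text)
    for marker in markers:
        index = text.find(marker)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()


# Hypothetical response text containing an extra round of dialogue.
raw = "vLLM is an inference engine. [INST] xxxxxx [/INST] more text"
print(truncate_at_turn_marker(raw))  # prints "vLLM is an inference engine."
```

This only hides the extra turns; the model still spends tokens generating them.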