
server : bugfix - stop server from sending empty json during oai chat completions #10694

Closed · wants to merge 2 commits

Conversation

m18coppola (Contributor)

Since #10643, the webui crashes when the model finishes generating a response:

[screenshot: llama_crash]

Upon investigation, I found that the server sends an empty JSON object before the on_complete JSON:

$ curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

...

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"?"}}],"created":1733515737,"id":"chatcmpl-LtMeOt2U3SPq40U172bsTkOzMiL8X4xC","model":"gpt-3.5-turbo-0613","object":"chat.completion.chunk"}

data: {}

data: {"choices":[{"finish_reason":"stop","index":0,"delta":{}}],"created":1733515738,"id":"chatcmpl-LtMeOt2U3SPq40U172bsTkOzMiL8X4xC","model":"gpt-3.5-turbo-0613","object":"chat.completion.chunk","timings":{"prompt_n":1,"prompt_ms":67.807,"prompt_per_token_ms":67.807,"prompt_per_second":14.747739908858966,"predicted_n":25,"predicted_ms":1543.024,"predicted_per_token_ms":61.72096,"predicted_per_second":16.20195149265339},"usage":{"completion_tokens":25,"prompt_tokens":23,"total_tokens":48}}

data: [DONE]

I made a change so that the server skips sending any empty JSON objects returned from the completion results stream.
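
For reference, here is a minimal sketch of the skip approach, assuming an nlohmann::json-based streaming handler like the one llama-server uses; the function and variable names are illustrative, not the actual server internals:

#include <nlohmann/json.hpp>
#include <string>
#include <vector>

using json = nlohmann::json;

// Serialize a batch of streamed completion results as SSE lines,
// dropping any result that serializes to an empty object ("{}").
std::string format_sse_chunks(const std::vector<json> & results) {
    std::string out;
    for (const auto & res : results) {
        if (res.empty()) {
            continue; // skip sending "data: {}" -- this is what crashed the webui
        }
        out += "data: " + res.dump() + "\n\n";
    }
    return out;
}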

@ggerganov (Owner)

It might be better to find the root cause of this empty JSON object and prevent it from being created in the first place.

@ggerganov (Owner)

I traced it: the empty object is created when an end-of-turn token is generated and the --special CLI arg is not passed to llama-server. In that case, the special token is not rendered and the content is empty.
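
For illustration, a sketch of where the empty content comes from, assuming llama.cpp's common_token_to_piece() helper (the wrapper function here is hypothetical):

#include "common.h" // common_token_to_piece()
#include <string>

// Without --special, special tokens are not rendered: converting an
// end-of-turn token yields an empty piece, so the delta built from it
// has no content and the chunk serializes to "{}".
std::string token_to_content(llama_context * ctx, llama_token token, bool render_special) {
    return common_token_to_piece(ctx, token, render_special);
}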

I think a better solution would be to send a valid object with empty content instead of skipping this message from the stream.
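
A minimal sketch of that alternative, following the chunk format shown in the curl output above (the helper name is hypothetical):

#include <nlohmann/json.hpp>
#include <ctime>
#include <string>

using json = nlohmann::json;

// Build a well-formed chat.completion.chunk whose delta carries empty
// content, instead of emitting a bare "{}".
json make_empty_delta_chunk(const std::string & id, const std::string & model) {
    return json{
        {"id",      id},
        {"object",  "chat.completion.chunk"},
        {"created", std::time(nullptr)},
        {"model",   model},
        {"choices", json::array({json{
            {"index",         0},
            {"finish_reason", nullptr},
            {"delta",         json{{"content", ""}}},
        }})},
    };
}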
