For example, starting the server with `python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-hf` and then running
```python
from openai import OpenAI

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

completion = client.completions.create(
    model="meta-llama/Llama-2-7b-hf",
    prompt="Hello!",
    logprobs=5,
)
print("Completion result:", completion)
```
works, but starting the server with `python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-chat-hf` and then running
```python
from openai import OpenAI

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

# Note: chat requests go through client.chat.completions.create,
# not client.completions.create.
completion = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}],
    logprobs=5,
)
print("Completion result:", completion)
```
does not.
For some reason, log probs are currently only implemented for the completions endpoint, not the chat endpoint. I think the fix would simply entail changing serving_chat.py (link) to incorporate log probs the way serving_completion.py does (link).
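To illustrate, here is a minimal, self-contained sketch of the kind of conversion the completions endpoint performs and that the chat handler could reuse. The dataclass and helper names below are stand-ins I made up for illustration, not vLLM's actual internals:

```python
# Hypothetical sketch: assemble per-token logprob data into an
# OpenAI-style logprobs payload, as the completions endpoint does.
# None of these names are vLLM's real API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogProbs:
    # Mirrors the shape of the OpenAI completions `logprobs` field.
    tokens: List[str] = field(default_factory=list)
    token_logprobs: List[float] = field(default_factory=list)
    top_logprobs: List[Dict[str, float]] = field(default_factory=list)


def create_logprobs(tokens: List[str],
                    token_logprobs: List[float],
                    top_logprobs: List[Dict[str, float]]) -> LogProbs:
    """Package the engine's per-token logprob output for the response.

    The proposed fix is for serving_chat.py to attach a payload like
    this to each chat choice instead of dropping the logprob data.
    """
    return LogProbs(tokens=tokens,
                    token_logprobs=token_logprobs,
                    top_logprobs=top_logprobs)


# Example of what the chat handler would attach to a choice:
logprobs = create_logprobs(
    tokens=["Hello", "!"],
    token_logprobs=[-0.12, -0.48],
    top_logprobs=[{"Hello": -0.12, "Hi": -2.3},
                  {"!": -0.48, ".": -1.7}],
)
print(logprobs)
```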