[Bug]: Internal Server Error when hosting Alibaba-NLP/gte-Qwen2-7B-instruct #5827
Comments
@markkofler vLLM doesn't have support for Qwen2 embedding models yet. I have a WIP PR here for this: #5611. From the logs I see the model is not running as an embedding model. Looking at the config file for the model, it seems that it is registered to run as […]. You can try changing this to be […].
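For readers following along, here is a minimal sketch for inspecting the config the comment above refers to. It assumes the relevant field is the `architectures` entry of the model's `config.json` (the comment does not say this verbatim), and the value to change it to is left as a comment because it is not preserved in this copy of the thread.

```python
# Hedged sketch: inspect how the checkpoint registers itself, assuming the
# relevant field is "architectures" in config.json (an assumption).
import json

from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="Alibaba-NLP/gte-Qwen2-7B-instruct",
    filename="config.json",
)

with open(config_path) as f:
    config = json.load(f)

# Shows which model class the checkpoint asks to be loaded as.
print(config["architectures"])

# To experiment with the suggestion above, copy the model files to a local
# directory, edit "architectures" in that local config.json to the class vLLM
# expects for embedding models (the exact name is not preserved in this thread),
# and point vLLM at the local directory instead of the Hub repo ID.
```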
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
@mgoin Thanks for the feedback! I would like to leave this issue open until the model is supported, as I see some pull requests are open for adding that support.
OK @markkofler, I'll mark this as keep-open. Could you link the PRs here?
Thanks @hmellor! The PRs are already linked above.
Your current environment
Using latest available docker image: vllm/vllm-openai:v0.5.0.post1
🐛 Describe the bug
I am getting an "Internal Server Error" response when calling the /v1/embeddings endpoint of our Kubernetes-deployed instance of Alibaba-NLP/gte-Qwen2-7B-instruct. I am using the following JSON request as the body:
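The JSON body itself was not preserved in this copy of the issue; the sketch below shows the kind of request typically sent to vLLM's OpenAI-compatible /v1/embeddings endpoint. The service URL and input text are assumptions, and the model name is taken from the issue title.

```python
# Hypothetical reproduction of the failing call; URL and input are assumptions.
import requests

response = requests.post(
    "http://vllm-service:8000/v1/embeddings",  # placeholder in-cluster service URL
    json={
        "model": "Alibaba-NLP/gte-Qwen2-7B-instruct",
        "input": "What is the capital of France?",
    },
    timeout=60,
)
# The reporter gets an HTTP 500 "Internal Server Error" back instead of embeddings.
print(response.status_code, response.text)
```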
For reference, here is the log of the vLLM container:
It would be great if somebody could help me get the model running as an embedding model for our colleagues. Any idea what could be wrong?
Thanks in advance!