[Feature]: LoRA support for InternVLChatModel #9495
Comments
Could you provide your LoRA configuration? I might be able to implement this quickly.
@jeejeelee Please help, it's quite urgent. Thanks! "lora_rank": 32,
I will. Can you provide the full configuration?
@jeejeelee
Currently, vLLM does not support LoRA inference for the visual encoder and projector components; it only supports LoRA inference for language models. Even if I were to implement LoRA support for InternVL, it would still only cover the language model.
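For context, this is roughly how language-model LoRA inference is wired up in vLLM's offline API. It is a minimal sketch only: the checkpoint name and adapter path are placeholders, and it assumes a vLLM build where InternVLChatModel has LoRA support registered (which is exactly what this issue asks for).

```python
# Sketch of vLLM's language-model LoRA path; placeholder model/adapter paths.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="OpenGVLab/InternVL2-8B",   # placeholder checkpoint
    trust_remote_code=True,
    enable_lora=True,
)

outputs = llm.generate(
    "Describe the scene in one sentence.",        # image inputs omitted for brevity
    SamplingParams(temperature=0.0, max_tokens=64),
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```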
@jeejeelee What is the easiest workaround for this? Deploying with merged LoRA weights is affecting performance, and I would rather deploy the original weights. Is there an alternative I can explore for fast inference in a production deployment?
How many LoRAs do you have? If you only have one, merging the weights would lead to higher inference efficiency.
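If a single adapter is enough, the merge can be done offline with PEFT before serving. A hedged sketch with placeholder paths, assuming the checkpoint loads via `trust_remote_code`:

```python
# Merge one LoRA into the base weights so the merged checkpoint can be
# served by vLLM without runtime LoRA support. Paths are placeholders.
from transformers import AutoModel
from peft import PeftModel

base = AutoModel.from_pretrained("OpenGVLab/InternVL2-8B", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "/path/to/lora_adapter")
merged = model.merge_and_unload()        # folds the LoRA deltas into the base weights
merged.save_pretrained("/path/to/merged_model")
```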
I have two LoRAs, a main one and an AdaLoRA one. Sharing the whole config here for reference:
@jeejeelee Efficiency is better, but the results are worse than with the unmerged one.
@AkshataABhat I have preliminarily completed the work of supporting LoRA in InternVL, but note that it can still only add LoRA to the language model. You can try it out; see my branch: https://github.com/jeejeelee/vllm/tree/internvl-lora . Additionally, I suggest you try adding LoRA only to the language model and retraining it.
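To illustrate the retraining suggestion, here is a hedged sketch of a PEFT config that restricts LoRA to the language-model submodule so the vision encoder and projector stay frozen. The module names and hyperparameters other than the rank are assumptions; check them against `model.named_modules()` for the actual checkpoint.

```python
# Hypothetical example of limiting LoRA to the language model of a
# vision-language checkpoint. The regex assumes the LLM lives under a
# "language_model" prefix; verify the real module names before training.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                               # rank mentioned earlier in the thread
    lora_alpha=64,                      # placeholder scaling factor
    lora_dropout=0.05,                  # placeholder dropout
    target_modules=r".*language_model.*\.(q_proj|k_proj|v_proj|o_proj)",
    task_type="CAUSAL_LM",
)
```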
@jeejeelee I discussed this with my team, and they want LoRA support for the vision model as well.
Hi~ I have some questions about this branch. Does it support InternVL2-8B (AWQ) with multiple LoRAs? Thank you!
If InternVL supports AWQ, I think it should be fine. You can try it out, and if you have any issues, I'll help you solve them.
@jeejeelee Hi~ I have a question about AWQ with LoRAs: when the language model (AWQ) is used with multiple LoRAs, do the LoRA weights need to be quantized?
You don't need to quantize them.
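In other words, only the base model is AWQ-quantized; the adapters stay in their original precision. A hedged sketch of what that looks like with vLLM's offline API, with placeholder checkpoint and adapter paths:

```python
# Sketch: AWQ-quantized base model served with two unquantized LoRA adapters.
# The AWQ checkpoint name and adapter paths are placeholders.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="OpenGVLab/InternVL2-8B-AWQ",   # placeholder AWQ checkpoint
    quantization="awq",
    trust_remote_code=True,
    enable_lora=True,
    max_loras=2,                          # allow two adapters to be active
)

sampling = SamplingParams(temperature=0.0, max_tokens=64)
for name, lora_id, path in [("adapter_a", 1, "/path/to/adapter_a"),
                            ("adapter_b", 2, "/path/to/adapter_b")]:
    out = llm.generate("Hello", sampling, lora_request=LoRARequest(name, lora_id, path))
    print(name, out[0].outputs[0].text)
```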
Your current environment
vllm version = 0.6.1
Model Input Dumps
No response
🐛 Describe the bug
vLLM version = 0.6.1. InternVLChatModel is in the list of supported models, but LoRA inference for it is not supported.