[Bug]: Qwen2.5-VL-7B Instruct LoRA model load failed #12872

Open · 1 task done
shenhaitao010 opened this issue Feb 7, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@shenhaitao010 commented:

Your current environment

vllm==0.7.2
transformers==4.49.0.dev0

🐛 Describe the bug

env CUDA_VISIBLE_DEVICES=0 vllm serve /data/Qwen/Qwen2.5-VL-7B-Instruct \
  --served-model-name Qwen2___5-VL-7B-Instruct --limit-mm-per-prompt image=1 \
  --max-model-len 16384 --dtype bfloat16 --gpu-memory-utilization 0.95 --host 0.0.0.0 --port 8001 \
  --trust-remote-code --max-num-batched-tokens 16384 --max-num-seqs 1 \
  --enable-lora --max-loras 5 --lora-modules vl_lora=/home/sht/LLaMA-Factory-new-20250205/LLaMA-Factory-main/saves/Qwen2.5-VL-7B-Instruct/lora/vl_images_tags_v1_train_2025-02-05-14-28-34
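
For reference, once the server starts, the adapter registered via --lora-modules is selected by its name (vl_lora here) through the "model" field of the OpenAI-compatible API. A minimal sketch of such a request, assuming the host/port from the command above and placeholder image URL and prompt:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, image_url: str) -> dict:
    # OpenAI-compatible chat payload. "model" picks the LoRA adapter by the
    # name given to --lora-modules; the base served-model-name selects no adapter.
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }],
    }

payload = build_chat_request("vl_lora", "Describe this image.",
                             "https://example.com/cat.jpg")

# With the server running:
# req = urllib.request.Request(
#     "http://0.0.0.0:8001/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```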

Loading the LoRA model failed with the following error:

[rank0]:[W207 15:18:18.529757338 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
Traceback (most recent call last):
File "/home/sht/.conda/envs/glm4/bin/vllm", line 8, in <module>
sys.exit(main())
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/vllm/scripts.py", line 204, in main
args.dispatch_function(args)
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/vllm/scripts.py", line 44, in serve
uvloop.run(run_server(args))
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/uvloop/__init__.py", line 82, in run
return loop.run_until_complete(wrapper())
File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/uvloop/__init__.py", line 61, in wrapper
return await main
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 879, in run_server
await init_app_state(engine_client, model_config, app.state, args)
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 765, in init_app_state
await state.openai_serving_models.init_static_loras()
File "/home/sht/.conda/envs/glm4/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_models.py", line 96, in init_static_loras
raise ValueError(load_result.message)
ValueError: While loading /home/sht/LLaMA-Factory-new-20250205/LLaMA-Factory-main/saves/Qwen2.5-VL-7B-Instruct/lora/vl_images_tags_v1_train_2025-02-05-14-28-34, expected target modules in ['o_proj', 'fc2', 'v_proj', 'down_proj', 'attn.proj', 'mlp.2', 'mlp.0', 'gate_up_proj', 'qkv', 'fc1', 'k_proj', 'q_proj', 'gate_projup_proj'] but received ['language_model.model.layers.0.mlp.gate_proj', 'language_model.model.layers.0.mlp.gate_proj', 'language_model.model.layers.0.mlp.up_proj', 'language_model.model.layers.0.mlp.up_proj', 'language_model.model.layers.1.mlp.gate_proj', 'language_model.model.layers.1.mlp.gate_proj', 'language_model.model.layers.1.mlp.up_proj', 'language_model.model.layers.1.mlp.up_proj', 'language_model.model.layers.10.mlp.gate_proj', 'language_model.model.layers.10.mlp.gate_proj', 'language_model.model.layers.10.mlp.up_proj', 'language_model.model.layers.10.mlp.up_proj', 'language_model.model.layers.11.mlp.gate_proj', 'language_model.model.layers.11.mlp.gate_proj', 'language_model.model.layers.11.mlp.up_proj', 'language_model.model.layers.11.mlp.up_proj', 'language_model.model.layers.12.mlp.gate_proj', 'language_model.model.layers.12.mlp.gate_proj', 'language_model.model.layers.12.mlp.up_proj', 'language_model.model.layers.12.mlp.up_proj', 'language_model.model.layers.13.mlp.gate_proj', 'language_model.model.layers.13.mlp.gate_proj', 'language_model.model.layers.13.mlp.up_proj', 'language_model.model.layers.13.mlp.up_proj', 'language_model.model.layers.14.mlp.gate_proj', 'language_model.model.layers.14.mlp.gate_proj', 'language_model.model.layers.14.mlp.up_proj', 'language_model.model.layers.14.mlp.up_proj', 'language_model.model.layers.15.mlp.gate_proj', 'language_model.model.layers.15.mlp.gate_proj', 'language_model.model.layers.15.mlp.up_proj', 'language_model.model.layers.15.mlp.up_proj', 'language_model.model.layers.16.mlp.gate_proj', 'language_model.model.layers.16.mlp.gate_proj', 'language_model.model.layers.16.mlp.up_proj', 
'language_model.model.layers.16.mlp.up_proj', 'language_model.model.layers.17.mlp.gate_proj', 'language_model.model.layers.17.mlp.gate_proj', 'language_model.model.layers.17.mlp.up_proj', 'language_model.model.layers.17.mlp.up_proj', 'language_model.model.layers.18.mlp.gate_proj', 'language_model.model.layers.18.mlp.gate_proj', 'language_model.model.layers.18.mlp.up_proj', 'language_model.model.layers.18.mlp.up_proj', 'language_model.model.layers.19.mlp.gate_proj', 'language_model.model.layers.19.mlp.gate_proj', 'language_model.model.layers.19.mlp.up_proj', 'language_model.model.layers.19.mlp.up_proj', 'language_model.model.layers.2.mlp.gate_proj', 'language_model.model.layers.2.mlp.gate_proj', 'language_model.model.layers.2.mlp.up_proj', 'language_model.model.layers.2.mlp.up_proj', 'language_model.model.layers.20.mlp.gate_proj', 'language_model.model.layers.20.mlp.gate_proj', 'language_model.model.layers.20.mlp.up_proj', 'language_model.model.layers.20.mlp.up_proj', 'language_model.model.layers.21.mlp.gate_proj', 'language_model.model.layers.21.mlp.gate_proj', 'language_model.model.layers.21.mlp.up_proj', 'language_model.model.layers.21.mlp.up_proj', 'language_model.model.layers.22.mlp.gate_proj', 'language_model.model.layers.22.mlp.gate_proj', 'language_model.model.layers.22.mlp.up_proj', 'language_model.model.layers.22.mlp.up_proj', 'language_model.model.layers.23.mlp.gate_proj', 'language_model.model.layers.23.mlp.gate_proj', 'language_model.model.layers.23.mlp.up_proj', 'language_model.model.layers.23.mlp.up_proj', 'language_model.model.layers.24.mlp.gate_proj', 'language_model.model.layers.24.mlp.gate_proj', 'language_model.model.layers.24.mlp.up_proj', 'language_model.model.layers.24.mlp.up_proj', 'language_model.model.layers.25.mlp.gate_proj', 'language_model.model.layers.25.mlp.gate_proj', 'language_model.model.layers.25.mlp.up_proj', 'language_model.model.layers.25.mlp.up_proj', 'language_model.model.layers.26.mlp.gate_proj', 
'language_model.model.layers.26.mlp.gate_proj', 'language_model.model.layers.26.mlp.up_proj', 'language_model.model.layers.26.mlp.up_proj', 'language_model.model.layers.27.mlp.gate_proj', 'language_model.model.layers.27.mlp.gate_proj', 'language_model.model.layers.27.mlp.up_proj', 'language_model.model.layers.27.mlp.up_proj', 'language_model.model.layers.3.mlp.gate_proj', 'language_model.model.layers.3.mlp.gate_proj', 'language_model.model.layers.3.mlp.up_proj', 'language_model.model.layers.3.mlp.up_proj', 'language_model.model.layers.4.mlp.gate_proj', 'language_model.model.layers.4.mlp.gate_proj', 'language_model.model.layers.4.mlp.up_proj', 'language_model.model.layers.4.mlp.up_proj', 'language_model.model.layers.5.mlp.gate_proj', 'language_model.model.layers.5.mlp.gate_proj', 'language_model.model.layers.5.mlp.up_proj', 'language_model.model.layers.5.mlp.up_proj', 'language_model.model.layers.6.mlp.gate_proj', 'language_model.model.layers.6.mlp.gate_proj', 'language_model.model.layers.6.mlp.up_proj', 'language_model.model.layers.6.mlp.up_proj', 'language_model.model.layers.7.mlp.gate_proj', 'language_model.model.layers.7.mlp.gate_proj', 'language_model.model.layers.7.mlp.up_proj', 'language_model.model.layers.7.mlp.up_proj', 'language_model.model.layers.8.mlp.gate_proj', 'language_model.model.layers.8.mlp.gate_proj', 'language_model.model.layers.8.mlp.up_proj', 'language_model.model.layers.8.mlp.up_proj', 'language_model.model.layers.9.mlp.gate_proj', 'language_model.model.layers.9.mlp.gate_proj', 'language_model.model.layers.9.mlp.up_proj', 'language_model.model.layers.9.mlp.up_proj']. Please verify that the loaded LoRA module is correct
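
The mismatch is visible in the error itself: vLLM resolves adapter targets against short names like gate_proj and up_proj, while this adapter was saved with a language_model. prefix on every module path. Until a fix lands in vLLM, one possible stopgap (an untested sketch, not an official vLLM feature) is to strip that prefix from the adapter's module names before loading:

```python
def strip_language_model_prefix(name: str) -> str:
    # The adapter names its modules "language_model.model.layers.N.mlp.gate_proj",
    # whereas vLLM 0.7.2 expects names rooted at "model." (suffix "gate_proj").
    # Dropping the extra "language_model." prefix aligns the two.
    prefix = "language_model."
    return name[len(prefix):] if name.startswith(prefix) else name

# Applied to one of the names from the error message:
remapped = strip_language_model_prefix("language_model.model.layers.0.mlp.gate_proj")
# The same renaming would have to be applied to every weight key in
# adapter_model.safetensors and to target_modules in adapter_config.json.
```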

Loading the same LoRA model with transformers works normally:
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
from peft import LoraConfig
from peft import PeftModel, PeftMixedModel

# Load the base model
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "/data/Qwen/Qwen2.5-VL-7B-Instruct", torch_dtype="auto", device_map="auto"
)

LORA_PATH = '/home/sht/LLaMA-Factory-new-20250205/LLaMA-Factory-main/saves/Qwen2.5-VL-7B-Instruct/lora/vl_images_tags_v1_train_2025-02-05-14-28-34'

# Attach the LoRA adapter
model = PeftModel.from_pretrained(model, model_id=LORA_PATH, adapter_name='vl_lora')
model.eval()
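
Since transformers/peft load the adapter fine, another possible stopgap is to merge the adapter into the base weights with peft's standard merge_and_unload API and serve the merged checkpoint with vLLM without --enable-lora. A hedged sketch (the output path is hypothetical; the processor/tokenizer would also need to be saved alongside):

```python
def merge_lora_into_base(base_path: str, lora_path: str, out_path: str) -> None:
    # Imports kept local so the sketch can be read without the heavy deps installed.
    from transformers import Qwen2_5_VLForConditionalGeneration
    from peft import PeftModel

    base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        base_path, torch_dtype="auto", device_map="auto"
    )
    # Fold the LoRA deltas into the base weights, then drop the peft wrapper.
    merged = PeftModel.from_pretrained(base, model_id=lora_path).merge_and_unload()
    merged.save_pretrained(out_path)  # serve this directory without --enable-lora

# merge_lora_into_base(
#     "/data/Qwen/Qwen2.5-VL-7B-Instruct",
#     "/home/sht/LLaMA-Factory-new-20250205/LLaMA-Factory-main/saves/Qwen2.5-VL-7B-Instruct/lora/vl_images_tags_v1_train_2025-02-05-14-28-34",
#     "/data/Qwen/Qwen2.5-VL-7B-Instruct-merged",  # hypothetical output path
# )
```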

@shenhaitao010 (Author) commented:

The LoRA model was trained using LLaMA-Factory.

@jeejeelee (Collaborator) commented:

This model has issues with LoRA; I will fix it as soon as possible.

@jeejeelee (Collaborator) commented:

#12905 should fix this issue.
