
[Feature Request]: Support INT4 for MiniCPM-Llama3-V-2_5 #6932

Closed
LSC527 opened this issue Jul 30, 2024 · 3 comments · Fixed by #9891

Comments

LSC527 commented Jul 30, 2024

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/work/minicpm_test/minicpm_vllm.py", line 9, in <module>
[rank0]:     llm = LLM(
[rank0]:   File "/home/work/vllm-main/vllm/entrypoints/llm.py", line 155, in __init__
[rank0]:     self.llm_engine = LLMEngine.from_engine_args(
[rank0]:   File "/home/work/vllm-main/vllm/engine/llm_engine.py", line 441, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/home/work/vllm-main/vllm/engine/llm_engine.py", line 251, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/home/work/vllm-main/vllm/executor/executor_base.py", line 47, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/home/work/vllm-main/vllm/executor/gpu_executor.py", line 36, in _init_executor
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/home/work/vllm-main/vllm/worker/worker.py", line 139, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/home/work/vllm-main/vllm/worker/model_runner.py", line 722, in load_model
[rank0]:     self.model = get_model(model_config=self.model_config,
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/model_loader/loader.py", line 283, in load_model
[rank0]:     model.load_weights(
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/models/minicpmv.py", line 680, in load_weights
[rank0]:     param = params_dict[name]
[rank0]: KeyError: 'llm.model.layers.0.mlp.down_proj.weight'
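
For context (not part of the original report), a minimal reproduction presumably looks like the sketch below; the model name is inferred from the issue title and only the llm = LLM( call is visible in the traceback, so this is an assumption rather than the reporter's exact script.

from vllm import LLM

# Hypothetical repro sketch: loading the int4 checkpoint named in the issue title.
# The KeyError above is raised inside MiniCPMV.load_weights() when a checkpoint
# tensor such as 'llm.model.layers.0.mlp.down_proj.weight' has no matching
# entry in the model's parameter dict.
llm = LLM(
    model="openbmb/MiniCPM-Llama3-V-2_5-int4",
    trust_remote_code=True,
)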
LSC527 added the bug (Something isn't working) label Jul 30, 2024
LSC527 (Author) commented Jul 30, 2024

@HwwwwwwwH

ywang96 (Member) commented Jul 30, 2024

@LSC527 Quantization for VLMs isn't supported yet, but it's indeed on our roadmap, which you can check here.

I'm going to change this issue to a feature request given it's not really a bug, thanks!

ywang96 added the feature request label and removed the bug (Something isn't working) label Jul 30, 2024
ywang96 changed the title from "[Bug]: MiniCPM-Llama3-V-2_5-int4 not supported" to "[Feature Request]: Support INT4 for MiniCPM-Llama3-V-2_5" Jul 30, 2024
mgoin (Member) commented Oct 31, 2024

@LSC527 Do you have another model to test other than openbmb/MiniCPM-Llama3-V-2_5-int4? That model doesn't seem to have a preprocessor config, so it fails after model loading with:

vllm serve openbmb/MiniCPM-Llama3-V-2_5-int4 --trust-remote-code --quantization bitsandbytes --load-format bitsandbytes

Error:

OSError: openbmb/MiniCPM-Llama3-V-2_5-int4 does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-int4/tree/main' for available files.
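
(Added note, not part of the original comment.) One way to confirm the missing file is to list the repo contents with huggingface_hub; whether the base (non-quantized) repo actually ships the file is an assumption here.

from huggingface_hub import list_repo_files

# Check which repos ship a preprocessor_config.json. The OSError above indicates
# the int4 repo does not; the base repo presumably does.
for repo in ("openbmb/MiniCPM-Llama3-V-2_5-int4", "openbmb/MiniCPM-Llama3-V-2_5"):
    files = list_repo_files(repo)
    print(repo, "preprocessor_config.json" in files)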
