
[Feature Request]: Support INT4 for MiniCPM-Llama3-V-2_5 #6932

Closed
LSC527 opened this issue Jul 30, 2024 · 3 comments · Fixed by #9891

Comments

LSC527 commented Jul 30, 2024

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/work/minicpm_test/minicpm_vllm.py", line 9, in <module>
[rank0]:     llm = LLM(
[rank0]:   File "/home/work/vllm-main/vllm/entrypoints/llm.py", line 155, in __init__
[rank0]:     self.llm_engine = LLMEngine.from_engine_args(
[rank0]:   File "/home/work/vllm-main/vllm/engine/llm_engine.py", line 441, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/home/work/vllm-main/vllm/engine/llm_engine.py", line 251, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/home/work/vllm-main/vllm/executor/executor_base.py", line 47, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/home/work/vllm-main/vllm/executor/gpu_executor.py", line 36, in _init_executor
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/home/work/vllm-main/vllm/worker/worker.py", line 139, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/home/work/vllm-main/vllm/worker/model_runner.py", line 722, in load_model
[rank0]:     self.model = get_model(model_config=self.model_config,
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/model_loader/loader.py", line 283, in load_model
[rank0]:     model.load_weights(
[rank0]:   File "/home/work/vllm-main/vllm/model_executor/models/minicpmv.py", line 680, in load_weights
[rank0]:     param = params_dict[name]
[rank0]: KeyError: 'llm.model.layers.0.mlp.down_proj.weight'
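
For context (not part of the original report), a minimal reproduction presumably looks like the sketch below; the model name is inferred from the issue title and only the llm = LLM( call is visible in the traceback, so this is an assumption rather than the reporter's exact script.

from vllm import LLM

# Hypothetical repro sketch: loading the int4 checkpoint named in the issue title.
# The KeyError above is raised inside MiniCPMV.load_weights() when a checkpoint
# tensor such as 'llm.model.layers.0.mlp.down_proj.weight' has no matching
# entry in the model's parameter dict.
llm = LLM(
    model="openbmb/MiniCPM-Llama3-V-2_5-int4",
    trust_remote_code=True,
)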
LSC527 added the bug (Something isn't working) label Jul 30, 2024
LSC527 (Author) commented Jul 30, 2024

@HwwwwwwwH

ywang96 (Member) commented Jul 30, 2024

@LSC527 Quantization for VLMs isn't supported yet, but it's indeed on our roadmap, which you can check here.

I'm going to change this issue to a feature request given it's not really a bug, thanks!

ywang96 added the feature request label and removed the bug (Something isn't working) label Jul 30, 2024
ywang96 changed the title from "[Bug]: MiniCPM-Llama3-V-2_5-int4 not supported" to "[Feature Request]: Support INT4 for MiniCPM-Llama3-V-2_5" Jul 30, 2024
mgoin (Member) commented Oct 31, 2024

@LSC527 Do you have another model to test other than openbmb/MiniCPM-Llama3-V-2_5-int4? That model doesn't seem to have a preprocessor config, so it fails after model loading with:

vllm serve openbmb/MiniCPM-Llama3-V-2_5-int4 --trust-remote-code --quantization bitsandbytes --load-format bitsandbytes

Error:

OSError: openbmb/MiniCPM-Llama3-V-2_5-int4 does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-int4/tree/main' for available files.
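
(Added note, not part of the original comment.) One way to confirm the missing file is to list the repo contents with huggingface_hub; whether the base (non-quantized) repo actually ships the file is an assumption here.

from huggingface_hub import list_repo_files

# Check which repos ship a preprocessor_config.json. The OSError above indicates
# the int4 repo does not; the base repo presumably does.
for repo in ("openbmb/MiniCPM-Llama3-V-2_5-int4", "openbmb/MiniCPM-Llama3-V-2_5"):
    files = list_repo_files(repo)
    print(repo, "preprocessor_config.json" in files)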
