[New Model]: MiniCPM-V-2_6-int4 #7727

tangent2018 · 2024-08-21T09:22:34Z

The model to consider.

https://huggingface.co/openbmb/MiniCPM-V-2_6-int4

The closest model vllm already supports.

No response

What's your difficulty of supporting the model you want?

Loading the model weights fails when running vLLM with MiniCPM-V-2_6-int4.

vllm environment:
docker image: vllm/vllm-openai:v0.5.4
pip install bitsandbytes==0.43.3

Example to reproduce:

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# The original snippet used MODEL_NAME without defining it.
MODEL_NAME = "openbmb/MiniCPM-V-2_6-int4"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
llm = LLM(
    model=MODEL_NAME,
    gpu_memory_utilization=1,
    trust_remote_code=True,
    max_model_len=2048,
    enforce_eager=True,
)

Output:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/tangent/AIChat/engines/OpenBMB/MiniCPM-V/code_vllm/local_try.py", line 13, in <module>
[rank0]:     llm = LLM(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 158, in __init__
[rank0]:     self.llm_engine = LLMEngine.from_engine_args(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 445, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 249, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 47, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 36, in _init_executor
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 139, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 722, in load_model
[rank0]:     self.model = get_model(model_config=self.model_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 327, in load_model
[rank0]:     model.load_weights(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/minicpmv.py", line 685, in load_weights
[rank0]:     param = params_dict[name]
[rank0]: KeyError: 'llm.layers.0.mlp.down_proj.weight'


DarkLight1337 · 2024-08-21T12:43:04Z

cc @HwwwwwwwH

seanzhang-zhichen · 2024-08-31T07:56:36Z

AssertionError: Attempted to load weight (torch.Size([2479104, 1])) into parameter (torch.Size([4304, 1152]))
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:01<?, ?it/s]
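The two shapes in this assertion are related by simple arithmetic, which may help explain the failure. The sketch below assumes that the int4 checkpoint stores each weight in the packed bitsandbytes 4-bit layout, with two 4-bit values per byte, flattened into a single column. Under that assumption, a (4304, 1152) parameter would be stored as a (2479104, 1) tensor, which matches the error. The helper name `packed_int4_shape` is hypothetical and is not part of vLLM or bitsandbytes.

```python
def packed_int4_shape(rows: int, cols: int) -> tuple[int, int]:
    """Shape of a weight after packing two 4-bit values into each byte.

    Assumption: the packed tensor is flattened to a single column,
    as the shape in the error message suggests.
    """
    n_values = rows * cols
    if n_values % 2 != 0:
        raise ValueError("an odd number of values cannot be packed in pairs")
    return (n_values // 2, 1)


# Shapes taken from the AssertionError above.
print(packed_int4_shape(4304, 1152))  # (2479104, 1)
```

If this reading is right, the loader is trying to copy an already-packed quantized tensor into an unquantized parameter, so the shapes cannot match.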

Sakura4036 · 2024-09-03T08:34:29Z

+1

LDLINGLINGLING · 2024-09-04T06:29:34Z

Hello. The model you are using is the bitsandbytes (bnb) int4-quantized model. vLLM does not support this quantized model.

Sakura4036 · 2024-09-06T01:31:19Z

Hello. The model you are using is the bitsandbytes (bnb) int4-quantized model. vLLM does not support this quantized model.

Oh, I see. Thanks.

tangent2018 added the "new model" label (Requests to new models) Aug 21, 2024

Sakura4036 mentioned this issue Sep 3, 2024

[Bug]: Unable to serve minicpm-v2.6 with GGUF quantization #8096

Closed

1 task

DarkLight1337 mentioned this issue Oct 18, 2024

[RFC]: Multi-modality Support on vLLM #4194

Open

84 tasks

DarkLight1337 mentioned this issue Oct 29, 2024

[Bugfix] Fix prefix strings for quantized VLMs #9772

Merged

mgoin mentioned this issue Oct 31, 2024

[Model] Support bitsandbytes for MiniCPMV #9891

Merged

DarkLight1337 closed this as completed in #9891 Nov 1, 2024
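For readers arriving after the fix, here is a minimal sketch of how the reproduction might be adapted. It assumes a vLLM build that includes #9891 and uses vLLM's `quantization="bitsandbytes"` and `load_format="bitsandbytes"` engine arguments. Whether both are needed for this pre-quantized checkpoint depends on the vLLM version, and this has not been verified against the model in this issue. The helper names `bnb_engine_kwargs` and `load_bnb_model` are hypothetical.

```python
MODEL_NAME = "openbmb/MiniCPM-V-2_6-int4"


def bnb_engine_kwargs(model_name: str) -> dict:
    """Engine arguments from the original report, plus bitsandbytes options."""
    return {
        "model": model_name,
        "trust_remote_code": True,
        "max_model_len": 2048,
        "enforce_eager": True,
        # Assumption: these two options select vLLM's bitsandbytes path.
        "quantization": "bitsandbytes",
        "load_format": "bitsandbytes",
    }


def load_bnb_model(model_name: str = MODEL_NAME):
    """Create the engine; needs a GPU, vLLM and bitsandbytes installed."""
    from vllm import LLM  # imported lazily so the helper above works without vLLM

    return LLM(**bnb_engine_kwargs(model_name))
```

Calling `load_bnb_model()` on an older image such as vllm/vllm-openai:v0.5.4 would be expected to fail in the same way as the original report.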
