[Model] Add support for Qwen2 for embeddings #5611

mgoin · 2024-06-17T20:16:03Z

This works for ssmits/Qwen2-7B-Instruct-embed-base:

>>> from vllm import LLM
>>> model = LLM("ssmits/Qwen2-7B-Instruct-embed-base")
>>> outputs = model.encode("Hello!")
Processed prompts: 100%|█████████████████████████████████████| 1/1 [00:00<00:00,  2.10it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
>>> outputs
[EmbeddingRequestOutput(request_id='0', outputs=EmbeddingOutput(embedding=3584), prompt_token_ids=[9707, 0], finished=True)]

However, this doesn't work for Alibaba-NLP/gte-Qwen2-7B-instruct, because its config still declares the architecture as Qwen2ForCausalLM, whereas we have been relying on embedding models being of the type XModel: https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct/blob/main/config.json

vllm/model_executor/models/qwen2.py

prattcmp · 2024-06-20T03:59:54Z

+1 this would be great

waters222 · 2024-06-22T02:29:56Z

I think people can edit the config file to change the architecture from Qwen2ForCausalLM to Qwen2Model.
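A minimal sketch of that config edit, assuming a local copy of the model directory and that the architecture is listed under the standard `architectures` key of `config.json`. The helper name `set_architecture` is hypothetical, and this only changes the declared architecture; whether the weights then load cleanly is a separate question.

```python
import json
from pathlib import Path


def set_architecture(config_path, old="Qwen2ForCausalLM", new="Qwen2Model"):
    """Rewrite the `architectures` entry of a local Hugging Face config.json.

    Hypothetical helper for illustration; it is not part of vLLM.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    architectures = config.get("architectures", [])
    # Replace only exact matches so any other entries are left alone.
    config["architectures"] = [new if name == old else name for name in architectures]
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config["architectures"]
```

It would be called on the downloaded snapshot, for example `set_architecture("/path/to/model/config.json")`, before pointing vLLM at that directory.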

Semihal · 2024-07-04T08:59:37Z

Should we expect a merge in the near future?

mgoin · 2024-07-04T12:17:11Z

@Semihal thanks for the reminder! I just merged with main, so if it is green we can land.

0xWelt · 2024-07-08T07:43:03Z

Does this pass "pytest tests/models/test_embedding.py"? I got "AssertionError: Not all values are within 0.01 of 1.0" with the model "ssmits/Qwen2-7B-Instruct-embed-base".

mgoin · 2024-07-08T14:13:24Z

@Nickydusk No, it does not pass. There seems to be a correctness issue between the implementations, so this is not ready to land. Investigating it is low on my priority list, so if anyone would like to take over this PR, feel free to do so and ping me for review.
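For anyone picking this up, here is a rough sketch of the kind of check the error message suggests. The test body is not shown in this thread, so this assumes it compares each embedding from one implementation against a reference embedding by cosine similarity and requires every similarity to be within 0.01 of 1.0. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def all_close_to_one(reference, candidate, tol=0.01):
    """Return (ok, similarities) for paired lists of embeddings.

    `ok` is True only if every pairwise similarity is within `tol` of 1.0.
    """
    sims = [cosine_similarity(r, c) for r, c in zip(reference, candidate)]
    return all(abs(s - 1.0) <= tol for s in sims), sims
```

Printing the returned similarities per prompt, rather than only asserting on them, may help show whether the mismatch is a small numerical drift or a completely different vector.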

starmemda · 2024-07-09T01:53:35Z

@youkaichao @WoosukKwon @zhuohan123 @simon-mo @DarkLight1337
Could you please check this PR and merge it? gte-Qwen2 is a Qwen2ForCausalLM embedding model which is quite efficient.

waters222 · 2024-07-09T19:53:54Z

vllm/model_executor/models/qwen2_embedding.py

+            if name.startswith(prefix):
+                name = name[len(prefix):]
+
+            if "rotary_emb.inv_freq" in name:


The following line needs to be added to load the model Alibaba-NLP/gte-Qwen2-7B-instruct:

if "lm_head.weight" in name: continue
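As a standalone sketch of where that check would sit, here is the name-filtering part of a weight-loading loop. This is not the PR's actual code: the function name is hypothetical, the `"model."` prefix value is an assumption, and so is skipping `rotary_emb.inv_freq` entries, since the quoted diff is cut off at that line.

```python
def filter_weight_names(names, prefix="model."):
    """Return the checkpoint weight names an embedding-only model would load.

    Hypothetical helper; `prefix` is an assumed value, not taken from the PR.
    """
    kept = []
    for name in names:
        # Strip the wrapper prefix, as in the quoted diff.
        if name.startswith(prefix):
            name = name[len(prefix):]
        # Assumption: rotary inverse-frequency buffers are skipped.
        if "rotary_emb.inv_freq" in name:
            continue
        # The suggested addition: an embedding model has no LM head,
        # so the checkpoint's lm_head.weight has nowhere to go.
        if "lm_head.weight" in name:
            continue
        kept.append(name)
    return kept
```

With a Qwen2ForCausalLM checkpoint, `lm_head.weight` is then dropped instead of being looked up among the embedding model's parameters.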

0xWelt · 2024-07-10T04:00:37Z

@mgoin @prattcmp @waters222 @Semihal @starmemda

I have opened a new PR #6282 to directly support 'gte-Qwen2' embedding models without modifying its Qwen2ForCausalLM architecture. We can collaborate on reviewing the code and merge it in the near future.

ybbz · 2024-08-01T02:49:03Z

Can this feature be merged?

mgoin · 2024-08-01T18:17:07Z

@ybbz The output is not correct/matching HF, as seen in the test. Anybody is welcome to debug it!

github-actions · 2024-11-01T02:06:39Z

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

mergify · 2024-11-01T02:07:17Z

This pull request has merge conflicts that must be resolved before it can be
merged. @mgoin please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

DarkLight1337 · 2024-11-02T07:24:02Z

Ping in case you forgot about this @mgoin . It should be quite straightforward to add embedding models now.

DarkLight1337 · 2024-11-09T15:14:42Z

Closing as superseded by #10184

Add support for Qwen2 for embeddings

901897c

mgoin changed the title ~~[Model} Add support for Qwen2 for embeddings~~ [Model] Add support for Qwen2 for embeddings Jun 17, 2024

mgoin mentioned this pull request Jun 17, 2024

[Feature]: support Qwen2 embedding #5600

Closed

mgoin added 2 commits June 17, 2024 20:32

Fix embedding loading

d9b0796

Fix loading with ssmits/Qwen2-7B-Instruct-embed-base

11a7a8a

robertgshaw2-redhat reviewed Jun 18, 2024

vllm/model_executor/models/qwen2.py (outdated, resolved)

mgoin linked an issue Jun 20, 2024 that may be closed by this pull request

[Feature]: support Qwen2 embedding #5600

Closed

mgoin mentioned this pull request Jun 25, 2024

[Bug]: Internal Server Error when hosting Alibaba-NLP/gte-Qwen2-7B-instruct #5827

Closed

Update based on comments and add test

daff511

mgoin marked this pull request as ready for review June 25, 2024 17:50

Merge branch 'upstream-main' into qwen2-embedding

9d203ee

waters222 reviewed Jul 9, 2024

0xWelt mentioned this pull request Jul 10, 2024

[Model] Add support for 'gte-Qwen2' embedding models #6282

Closed

noooop mentioned this pull request Sep 25, 2024

[RFC]: Support encode only models by Workflow Defined Engine #8453

Closed

github-actions bot added the stale Over 90 days of inactivity label Nov 1, 2024

mergify bot added the needs-rebase label Nov 1, 2024

github-actions bot removed the stale Over 90 days of inactivity label Nov 4, 2024

github-actions bot added the unstale Recieved activity after being labelled stale label Nov 4, 2024

DarkLight1337 mentioned this pull request Nov 9, 2024

[Model] Support Qwen2 embeddings and use tags to select model tests #10184

Merged

DarkLight1337 closed this Nov 9, 2024
