[FEATURE] Enables /score endpoint for embedding models #12846

Merged
41 commits merged into vllm-project:main from the scoring-openai branch
Feb 21, 2025


gmarinho2
Contributor

@gmarinho2 gmarinho2 commented Feb 6, 2025

Enables /score endpoint for all embedding models via API.

Issue: [FEATURE] Enables offline /score for embedding models (2/2)

gmarinho2 and others added 18 commits January 21, 2025 11:18
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>

github-actions bot commented Feb 6, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the frontend label Feb 6, 2025
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
@gmarinho2 gmarinho2 force-pushed the scoring-openai branch 3 times, most recently from 3599577 to 8b0b41f Compare February 7, 2025 18:53
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
@gmarinho2 gmarinho2 force-pushed the scoring-openai branch 2 times, most recently from c4762c7 to 2d44144 Compare February 7, 2025 23:53
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
@mergify mergify bot added the documentation label Feb 13, 2025
maxdebayser and others added 4 commits February 13, 2025 14:20
As stated in the PR that introduced the /rerank API,
the code was 90% the same. With this refactoring it
also becomes possible to use the /rerank API with
embedding models

Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
@maxdebayser
Contributor

@DarkLight1337 and @K-Mistele , this PR is ready for review. The changes are:

  • Refactor OpenAIServingScores to enable the /score endpoint for embedding models. In this case the cosine similarity between the embedding pairs is returned.
  • Reuse OpenAIServingScores to also serve the /rerank endpoints. As a result of this refactoring, the /rerank endpoint now also supports embedding models.
  • Improve /score unit tests to compare the results with the reference implementation (sentence-transformers)
  • Add an embedding model to the /score unit tests
  • Update the documentation.
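
As a rough illustration of the first two points, here is a minimal sketch of how a score can be derived from a pair of embeddings via cosine similarity, and how the same scores could drive a rerank. This is not the code from this PR: the function names `cosine_similarity`, `score_pairs`, and `rerank` are hypothetical, and plain Python lists stand in for the embedding vectors the model would return.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def score_pairs(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[float]:
    """Score one query embedding against each document embedding."""
    return [cosine_similarity(query, doc) for doc in documents]


def rerank(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, highest score first."""
    scores = score_pairs(query, documents)
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings", purely illustrative.
    query = [1.0, 0.0]
    docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    for index, score in rerank(query, docs):
        print(index, round(score, 3))
```

Sharing one scoring routine between both operations mirrors the motivation given above for the refactoring: reranking is scoring followed by a sort.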

Member

@DarkLight1337 DarkLight1337 left a comment


Thanks for adding this! Some initial comments.

Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Member

@DarkLight1337 DarkLight1337 left a comment


Thanks, LGTM now. @K-Mistele can you take another pass at this as well?

gmarinho2 and others added 3 commits February 18, 2025 12:51
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
@gmarinho2
Contributor Author

@joerunde

@K-Mistele
Contributor

Thanks for the ping @DarkLight1337 - looks good to me.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) February 21, 2025 03:36
@github-actions github-actions bot added the ready label Feb 21, 2025
@DarkLight1337
Member

Since main is broken, I'm force-merging this - test_score and test_rerank passed prior to the latest merge.

@simon-mo simon-mo merged commit 1c3c975 into vllm-project:main Feb 21, 2025
40 of 58 checks passed
@gmarinho2 gmarinho2 deleted the scoring-openai branch February 21, 2025 18:17
michaelrglass pushed a commit to michaelrglass/vllm that referenced this pull request Feb 21, 2025
Labels
documentation, frontend, ready

5 participants