common : add default embeddings presets #11677

danbev · 2025-02-05T14:30:56Z

This commit adds default embeddings presets for the following models:

- bge-small-en-v1.5
- e5-small-v2
- gte-small

These can be used with llama-embedding and llama-server.

For example, with llama-embedding:

./build/bin/llama-embedding --embd-gte-small-default -p "Hello, how are you?"

And with llama-server:

./build/bin/llama-server --embd-gte-small-default

And the embeddings endpoint can then be called with a POST request:

curl --request POST \
    --url http://localhost:8080/embeddings \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello, how are you?"}'
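
The same request can also be issued from a script. The following is a minimal Python sketch, assuming llama-server was started with `--embd-gte-small-default` and is listening on `localhost:8080` as in the curl example. The helper names `build_request` and `post_embeddings` are hypothetical and only for illustration. The response is returned as decoded JSON without interpreting it, since its exact layout is not described in this PR.

```python
import json
import urllib.request

# Assumption: same host, port and path as the curl example above.
EMBEDDINGS_URL = "http://localhost:8080/embeddings"


def build_request(text, url=EMBEDDINGS_URL):
    """Build the POST request from the curl example without sending it."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_embeddings(text, url=EMBEDDINGS_URL):
    """Send the request and return the decoded JSON response as-is."""
    with urllib.request.urlopen(build_request(text, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `post_embeddings("Hello, how are you?")` should return whatever JSON the endpoint produces. Inspect that result before relying on specific keys.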

I'm not sure whether these are the most commonly used embedding models, but hopefully they are a good starting point for discussion and further improvements.

Refs: #10932

common/arg.cpp

Default to Q8_0 quantization.

Update gte-small model to ggml-org/gte-small-Q8_0-GGUF.

Update the remaining presets to use the models from ggml-org.

ggerganov reviewed Feb 6, 2025

Review comment on common/arg.cpp (outdated, resolved)

ggerganov reviewed Feb 6, 2025

Review comment on common/arg.cpp (outdated, resolved)

danbev added 3 commits February 6, 2025 08:56

squash! common : add default embeddings presets [no ci]

e07000b

Default to Q8_0 quantization.

squash! common : add default embeddings presets

61f410f

Update gte-small model to ggml-org/gte-small-Q8_0-GGUF.

squash! common : add default embeddings presets

d1d0a61

Update the remaining presets to use the models from ggml-org.

danbev marked this pull request as ready for review February 6, 2025 10:48

danbev requested a review from ggerganov February 7, 2025 05:06

ggerganov approved these changes Feb 7, 2025

danbev merged commit b7552cf into ggml-org:master Feb 7, 2025
46 checks passed
