llama : accept a list of devices to use to offload a model #10497

slaren · 2024-11-25T16:14:43Z

Adds the parameters --device (-dev) and --device-draft (-devd), which take a comma-separated list of devices to use with the main and draft models, respectively. Use --list-devices to see the available devices. Passing --device none (or -dev none) disables GPU usage completely, including for large batches.
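As an illustration of the comma-separated format described above, here is a minimal sketch of how such a value could be split. It is not the parser added in this PR: `split_device_list` is a hypothetical helper, and the names `CUDA0` and `CUDA1` in the usage note are placeholders only. The sketch treats the single value `none` as an empty selection and skips empty entries, both of which are assumptions about reasonable behavior.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not llama.cpp's actual parser): split a
// comma-separated device list into individual device names.
// Assumption: the single value "none" means "offload to no device"
// and is represented here as an empty list.
std::vector<std::string> split_device_list(const std::string & value) {
    std::vector<std::string> names;
    if (value == "none") {
        return names; // offloading disabled
    }
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(',', start);
        if (end == std::string::npos) {
            end = value.size();
        }
        if (end > start) {
            // keep only non-empty entries
            names.push_back(value.substr(start, end - start));
        }
        start = end + 1;
    }
    return names;
}
```

Calling `split_device_list("CUDA0,CUDA1")` would yield two names, while `split_device_list("none")` yields an empty list. A real implementation would presumably also validate each name against the devices reported by --list-devices.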

API changes: adds a devices parameter to llama_model_params. If it is NULL, all available devices are used (the same as the current behavior). ggml_backend_reg_by_name and ggml_backend_dev_by_name are now case-insensitive.
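To make the case-insensitive lookup concrete, the following self-contained sketch models it with stand-in types. Nothing here is the ggml or llama.cpp API: `fake_device`, `find_device`, and `select_devices` are hypothetical, and terminating the selected list with a null pointer is an assumption about how a device array might be passed to a C-style parameter. Unknown names are silently skipped in this sketch, whereas a real caller would likely report an error.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a backend device handle.
struct fake_device {
    std::string name;
};

// ASCII case-insensitive string comparison.
static bool iequals(const std::string & a, const std::string & b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::string::size_type i = 0; i < a.size(); ++i) {
        if (std::tolower((unsigned char) a[i]) != std::tolower((unsigned char) b[i])) {
            return false;
        }
    }
    return true;
}

// Look up a device by name, ignoring case; returns nullptr if not found.
fake_device * find_device(std::vector<fake_device> & registry, const std::string & name) {
    for (auto & dev : registry) {
        if (iequals(dev.name, name)) {
            return &dev;
        }
    }
    return nullptr;
}

// Build a selection from the requested names.
// Assumption: the list is terminated by a null pointer; unknown names are skipped.
std::vector<fake_device *> select_devices(std::vector<fake_device> & registry,
                                          const std::vector<std::string> & names) {
    std::vector<fake_device *> selected;
    for (const auto & name : names) {
        if (fake_device * dev = find_device(registry, name)) {
            selected.push_back(dev);
        }
    }
    selected.push_back(nullptr); // terminator
    return selected;
}
```

With a registry containing a device named `CUDA0`, both `find_device(registry, "cuda0")` and `find_device(registry, "CUDA0")` return the same entry, which is the behavior the PR describes for the by-name lookups.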

…#10497)
* llama : accept a list of devices to use to offload a model
* accept `--dev none` to completely disable offloading
* fix dev list with dl backends
* rename env parameter to LLAMA_ARG_DEVICE for consistency

slaren added the "breaking change" label (changes that break ABIs, APIs, file formats, or other forms of backwards compatibility) on Nov 25, 2024

llama : accept a list of devices to use to offload a model

f4457cb

slaren force-pushed the sl/llama-dev-selection branch from 8a793ce to f4457cb on November 25, 2024 16:16

slaren added 2 commits November 25, 2024 17:24

accept --dev none to completely disable offloading

d1dc8fb

fix dev list with dl backends

d6cf918

slaren force-pushed the sl/llama-dev-selection branch from a6fc1d9 to d6cf918 on November 25, 2024 17:01

fix other examples

acf43cc

github-actions bot added the examples, server, and ggml (changes relating to the ggml tensor library for machine learning) labels on Nov 25, 2024

ggerganov approved these changes Nov 25, 2024

rename env parameter to LLAMA_ARG_DEVICE for consistency

42f61c8

slaren merged commit 10bce04 into master Nov 25, 2024
3 of 55 checks passed

slaren deleted the sl/llama-dev-selection branch November 25, 2024 18:30

This was referenced Nov 25, 2024

changelog : libllama API #9289

Open

server : add speculative decoding support #10455

Merged

mostlygeek mentioned this pull request Nov 26, 2024

Misc. bug: -sm row does not work with --device #10533

Open
