AI: -nvidia "all" is not obvious what it does #3370

Open
Titan-Node opened this issue Jan 24, 2025 · 0 comments
Labels
status: triage (this issue has not been evaluated yet)


Describe the bug
When the flag `-nvidia "all"` is used for AI inference, aiModels.json must contain one model per installed GPU; otherwise the node tries to run the first model in the list on every GPU.

Not sure if this is the expected behavior.

To Reproduce

  1. Install 2 GPUs in a machine.
  2. Set `-nvidia "all"`.
  3. Set your aiModels.json to include only the LLM model:

```json
[
    {
        "pipeline": "llm",
        "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "price_per_unit": 80000000,
        "pixels_per_unit": 1000000,
        "warm": true
    }
]
```

  4. If the second GPU has less than 24 GB of VRAM, it fails to launch the container and times out.

Expected behavior
If we specify the GPUs explicitly, i.e. `-nvidia 0,1`, then the natural assumption is that aiModels.json will have two models loaded, and that configuration works, e.g.:

```json
[
    {
        "pipeline": "llm",
        "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "price_per_unit": 80000000,
        "pixels_per_unit": 1000000,
        "warm": true
    },
    {
        "pipeline": "text-to-image",
        "model_id": "ByteDance/SDXL-Lightning",
        "price_per_unit": 4768371,
        "warm": true
    }
]
```

I think the documentation needs to be updated, but I am unsure whether this is actually the intended logic for the "all" value.
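If the fallback described above is indeed the behavior, one possible mitigation is a startup check that fails fast when fewer models are configured than GPUs selected. This is a sketch only, not an existing go-livepeer check; the function name and the hard-coded path and GPU count are hypothetical:

```go
// Sketch of a preflight check: refuse to start when -nvidia "all"
// detects more GPUs than aiModels.json defines, instead of silently
// reusing the first entry on every GPU.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type ModelConfig struct {
	Pipeline string `json:"pipeline"`
	ModelID  string `json:"model_id"`
}

func checkModelCount(path string, numGPUs int) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var models []ModelConfig
	if err := json.Unmarshal(data, &models); err != nil {
		return err
	}
	if len(models) < numGPUs {
		return fmt.Errorf("aiModels.json defines %d model(s) but %d GPUs were selected",
			len(models), numGPUs)
	}
	return nil
}

func main() {
	// Hypothetical values matching this report: 2 GPUs via -nvidia "all".
	if err := checkModelCount("aiModels.json", 2); err != nil {
		fmt.Fprintln(os.Stderr, "startup check failed:", err)
		os.Exit(1)
	}
}
```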

Set Up
Slot 0 - 3090 (24 GB VRAM)
Slot 1 - 2080 Ti (11 GB VRAM)
