AI: -nvidia "all" is not obvious what it does #3370

Open
Titan-Node opened this issue Jan 24, 2025 · 0 comments
Labels
status: triage (this issue has not been evaluated yet)


Describe the bug
When the flag `-nvidia "all"` is used for AI inference, aiModels.json must contain one model per installed GPU; otherwise the node tries to run the first model in the list on every GPU.

Not sure if this is the expected behavior.

To Reproduce

  1. Install 2 GPUs in a machine.
  2. Set `-nvidia "all"`.
  3. Set your aiModels.json to include only the LLM model:

```json
[
    {
        "pipeline": "llm",
        "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "price_per_unit": 80000000,
        "pixels_per_unit": 1000000,
        "warm": true
    }
]
```

  4. If the second GPU has less than 24 GB of VRAM, it fails to launch the container and times out.

Expected behavior
If we specify the GPUs explicitly, i.e. `-nvidia 0,1`, then the natural assumption is that aiModels.json will have two models loaded, and that configuration works, e.g.:

```json
[
    {
        "pipeline": "llm",
        "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "price_per_unit": 80000000,
        "pixels_per_unit": 1000000,
        "warm": true
    },
    {
        "pipeline": "text-to-image",
        "model_id": "ByteDance/SDXL-Lightning",
        "price_per_unit": 4768371,
        "warm": true
    }
]
```

I think the documentation needs to be updated, but I am unsure whether this is actually the intended logic for the "all" value.
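If the fallback described above is indeed the behavior, one possible mitigation is a startup check that fails fast when fewer models are configured than GPUs selected. This is a sketch only, not an existing go-livepeer check; the function name and the hard-coded path and GPU count are hypothetical:

```go
// Sketch of a preflight check: refuse to start when -nvidia "all"
// detects more GPUs than aiModels.json defines, instead of silently
// reusing the first entry on every GPU.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type ModelConfig struct {
	Pipeline string `json:"pipeline"`
	ModelID  string `json:"model_id"`
}

func checkModelCount(path string, numGPUs int) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var models []ModelConfig
	if err := json.Unmarshal(data, &models); err != nil {
		return err
	}
	if len(models) < numGPUs {
		return fmt.Errorf("aiModels.json defines %d model(s) but %d GPUs were selected",
			len(models), numGPUs)
	}
	return nil
}

func main() {
	// Hypothetical values matching this report: 2 GPUs via -nvidia "all".
	if err := checkModelCount("aiModels.json", 2); err != nil {
		fmt.Fprintln(os.Stderr, "startup check failed:", err)
		os.Exit(1)
	}
}
```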

Set Up
Slot 0 - 3090 (24 GB VRAM)
Slot 1 - 2080 Ti (11 GB VRAM)
