Update Makefile #2482
Conversation
Use the environment variable `CUDA_NATIVE_ARCH` if present to set NVCC arch. Otherwise, use `native`.
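As a rough sketch of the proposed selection logic (written in shell rather than make syntax, with a hypothetical `NVCC_ARCH_FLAG` variable — not the PR's literal diff):

```shell
# Prefer CUDA_NATIVE_ARCH when set; otherwise fall back to "native" as before.
CUDA_NATIVE_ARCH="sm_52"   # e.g. exported by the user

if [ -n "${CUDA_NATIVE_ARCH}" ]; then
    NVCC_ARCH_FLAG="-arch=${CUDA_NATIVE_ARCH}"
else
    NVCC_ARCH_FLAG="-arch=native"
fi
echo "$NVCC_ARCH_FLAG"
```

With something like this in place, `make LLAMA_CUBLAS=1 CUDA_NATIVE_ARCH=sm_52` would pin a specific arch, while a plain `make LLAMA_CUBLAS=1` would keep the old `native` behavior.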
Doesn't this do essentially the same as …
It does, but perhaps just testing whether …
With this change, CUDA_DOCKER_ARCH now implies …
Hm, in what scenario would this be wrong? …
Before this PR, you could use e.g. `CUDA_DOCKER_ARCH=sm_52` to set the NVCC arch, which would evaluate to: …
After this PR: …
I think the last `-arch` option will take precedence.
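The precedence claim above can be illustrated with a small shell sketch (the flag string is hypothetical) that extracts the last `-arch` option from the command line, mirroring how nvcc is said to behave when the flag is repeated:

```shell
# Hypothetical flag string where two -arch options were appended,
# e.g. one derived from CUDA_DOCKER_ARCH and one from the new default.
FLAGS="-arch=sm_52 -O3 -arch=native"

# If the last -arch wins (as suggested above), this is the effective arch:
LAST_ARCH=$(printf '%s\n' $FLAGS | grep '^-arch=' | tail -n 1)
echo "$LAST_ARCH"
```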
This reverts commit 96981f3. See: #2482 (comment)
* master: (350 commits)
  - speculative : ensure draft and target model vocab matches (ggerganov#3812)
  - llama : correctly report GGUFv3 format (ggerganov#3818)
  - simple : fix batch handling (ggerganov#3803)
  - cuda : improve text-generation and batched decoding performance (ggerganov#3776)
  - server : do not release slot on image input (ggerganov#3798)
  - batched-bench : print params at start
  - log : disable pid in log filenames
  - server : add parameter -tb N, --threads-batch N (ggerganov#3584) (ggerganov#3768)
  - server : do not block system prompt update (ggerganov#3767)
  - sync : ggml (conv ops + cuda MSVC fixes) (ggerganov#3765)
  - cmake : add missed dependencies (ggerganov#3763)
  - cuda : add batched cuBLAS GEMM for faster attention (ggerganov#3749)
  - Add more tokenizer tests (ggerganov#3742)
  - metal : handle ggml_scale for n%4 != 0 (close ggerganov#3754)
  - Revert "make : add optional CUDA_NATIVE_ARCH (ggerganov#2482)"
  - issues : separate bug and enhancement template + no default title (ggerganov#3748)
  - Update special token handling in conversion scripts for gpt2 derived tokenizers (ggerganov#3746)
  - llama : remove token functions with `context` args in favor of `model` (ggerganov#3720)
  - Fix baichuan convert script not detecing model (ggerganov#3739)
  - make : add optional CUDA_NATIVE_ARCH (ggerganov#2482)
  - ...
This reverts commit 96981f3. See: ggerganov/llama.cpp#2482 (comment)
# Change:

Use the environment variable `CUDA_NATIVE_ARCH` if present to set the NVCC arch. Otherwise, use `native` as currently.

# Reasoning:

Running `make LLAMA_CUBLAS=1` errors with:

```
nvcc fatal : Value 'native' is not defined for option 'gpu-architecture'
make: *** [Makefile:249: ggml-cuda.o] Error 1
```

Making the `-arch` flag passed to nvcc configurable allows a specific arch to be chosen, which solves this problem.