Bug: ROCm CUDA error #8504
Comments
Same, but with an RX 7600 XT (gfx1102).
I'm also having the same issue when running the latest Ollama build from source (97c20ed) on my RX 6700 XT on Ubuntu Server 22.04. I thought setting the override and target as pointed out elsewhere would fix it, but it didn't for me. Running as root:
Output:
@m828 I know you're not using Ollama but I hope this helps somehow. This was the missing piece for me: ollama/ollama#3107 (comment)
Except instead of
It appears I accidentally copied gfx1030 from the provided build command when it should have been gfx1102 for my card. Changing it fixed the issue.
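For anyone hitting the same mismatch, here is a minimal sketch of reading the target from the machine instead of copying it from a build command. It assumes `rocminfo` is installed and prints the GPU's architecture name (for example gfx1102) somewhere in its output. The helper name `first_gfx_target` is hypothetical and not part of ROCm or llama.cpp, and the cmake flags are the ones quoted in this thread.

```shell
# Hypothetical helper: print the first gfx architecture name found on stdin.
# Intended to be fed the output of `rocminfo`; prints nothing if no gfx name is present.
first_gfx_target() {
    grep -oE 'gfx[0-9a-f]+' | head -n 1
}

# Intended use (left commented out so that sourcing this file has no side effects):
#   target="$(rocminfo | first_gfx_target)"
#   cmake -S . -B build -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS="$target" -DCMAKE_BUILD_TYPE=Release
```

On a multi-GPU system, or one with an integrated GPU, the first match may not be the card you intend to use, so check the printed value before building.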
This issue was closed because it has been inactive for 14 days since being marked as stale.
What happened?
ggml_cuda_compute_forward: RMS_NORM failed
CUDA error: invalid device function
current device: 0, in function ggml_cuda_compute_forward at ggml/src/ggml-cuda.cu:2288
err
GGML_ASSERT: ggml/src/ggml-cuda.cu:101: !"CUDA error"
[New LWP 252]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007dc7bf87142f in __GI___wait4 (pid=255, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007dc7bf87142f in __GI___wait4 (pid=255, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x0000647041457f0b in ggml_print_backtrace ()
#2 0x000064704132bb47 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) ()
#3 0x00006470413300ea in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#4 0x00006470414a41d6 in ggml_backend_sched_graph_compute_async ()
#5 0x00006470414fdd7a in llama_decode ()
#6 0x00006470415ca265 in llama_init_from_gpt_params(gpt_params&) ()
#7 0x000064704131315e in main ()
[Inferior 1 (process 251) detached]
Name and Version
./llama-cli -m models/ggml-meta-llama-3-8b-Q4_K_M.gguf -p "You are a helpful assistant" -cnv -c 512 --n-gpu-layers 99
AMD Radeon RX 6700 XT
Built following the online instructions, with this build command and runtime environment:
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build --config Release -- -j 16
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export HIP_VISIBLE_DEVICES=0
What operating system are you seeing the problem on?
No response
Relevant log output
No response