Bug: Garbage outputs on Vulkan backend since #10301 (possible NaN issue?) #10434
Labels
bug-unconfirmed
high severity
What happened?
Since commit b3e5859 (PR #10301), the Vulkan backend tends to glitch out after a few tokens. This happened to me a few times with Llama-3 8B, and happens consistently with Qwen2.5-Coder 0.5B. My guess is that invalid values like NaN or Inf appear somewhere during the softmax computation.
How to reproduce
Command:
.\build\bin\Release\llama-cli.exe -m E:\Downloads\Qwen2.5-Coder-0.5B.f16.gguf -ngl 99 -t 6 -tb 12 -p "Hello, I'm a" --seed 0 -n 512
Output:
Expected output (commit 557924f)
Git bisect results
Full logs
Name and Version
.\build\bin\Release\llama-cli.exe --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64
version: 4128 (b3e5859)
built with MSVC 19.41.34120.0 for x64
What operating system are you seeing the problem on?
Windows