Vulkan: possible NaN propagation on llama-3 8B (more testing required) #6874
I have encountered this with both mistral-7b-instruct-v0.1.Q4_K_M.gguf running on an Intel UHD 620 (Manjaro Linux) and mistral-7b-instruct-v0.2-code-ft.Q4_K_M.gguf on a GTX 1070 (Arch Linux). Some example output of the former:
I compiled with Vulkan (no Docker, if that happens to matter; I doubt it, though) and passing
This issue was closed because it has been inactive for 14 days since being marked as stale.
Sometimes when playing around with the new Llama-3 models using the Vulkan backend (on the server example), I end up in a situation where the model suddenly starts generating complete gibberish. Once this happens, the server keeps generating garbage only, even when evaluating a new prompt that used to work before.
A server restart fixes the output. (until the next time it happens)
My setup:
GPU: AMD Radeon RX 5700 XT (gfx1010) | uma: 0 | fp16: 1 | warp size: 64
OS: Windows 10 22H2
I suspect some operations are randomly generating NaNs, which persist even after clearing the KV cache. Reminds me a bit of #5243, except it doesn't always happen.
I'll try to build a simple setup to consistently cause this issue.
Edit: I can't find a new prompt that causes the problem, and I can't really share the one I already have; if I try to remove the sensitive information, it doesn't cause the issue anymore... The one I have consistently crashes the Llama-3-8B base model (tested with Q3_K_S/Q3_K_M/Q4_K_S), but not the instruct model. No issue with the same prompt on other backends.