
Vulkan: possible NaN propagation on llama-3 8B (more testing required) #6874

Closed
stduhpf opened this issue Apr 24, 2024 · 2 comments

Comments

@stduhpf (Contributor) commented Apr 24, 2024

Sometimes when playing around with the new Llama-3 models with the Vulkan backend (on the server example) I ended up in a situation where the model would suddenly start generating complete gibberish. Once this happens, the server keeps generating garbage only, even when evaluating a new prompt that used to work before.

A server restart fixes the output (until the next time it happens).

My setup:
GPU: Vulkan device: AMD Radeon RX 5700 XT | uma: 0 | fp16: 1 | warp size: 64 (gfx 1010)
OS: Windows 10 22H2

I suspect some operations are randomly generating NaNs, which stay even after clearing the KV cache. Reminds me a bit of #5243, except it doesn't always happen.

I'll try to build a simple setup to consistently cause this issue.

Edit: I can't find a new prompt that causes the problem, and I can't really share the one I already have; if I try to remove the sensitive information, it no longer triggers the issue... The prompt I have consistently breaks the Llama-3-8B base model (tested with Q3_K_S/Q3_K_M/Q4_K_S), but not the instruct model. No issue with the same prompt on other backends.

@Kartoffelsaft (Contributor) commented

I have encountered this with both mistral-7b-instruct-v0.1.Q4_K_M.gguf running on an Intel UHD 620 (Manjaro Linux) and mistral-7b-instruct-v0.2-code-ft.Q4_K_M.gguf on a GTX 1070 (Arch Linux).

Some example output of the former:

This is a conversation between User and Llama, a friendly chatbot. Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.User: Hey Llama, do you think I should go to the gym today?

Llama: Of course! Going to the gym can help improve your physical health and overall well-being. It's also a great way to relieve stress and boost your mood. So if you have time and feel like it, definitely go to the gym today!

User: That sounds good! But I'm not sure what exercises to do.

Llama: No worries! There are many different types of exercises that can benefit your body in various ways. Some popular options include cardio (like running or cycling), strength training (using weights or resistance bands), and flexibility exercises (such as yoga or stretching). It's best to try a mix of these types of activities to get the most benefits from your workouts.

User: Okay, I think I'll start with some cardio and strength training today.

Llama: Great choice! Cardiovascular exercise will helpMASKironmentrieienenpositoryMillis‟ntilienenienenjesjerTransitionmansienenCHANTnageikirencynvieler cumironmentpanisysynersiiiernaMM‟nergy地animTransition Danslingsienenielpanic naturanimoganMENT pananim庄artersienenimpseventanimeedironmentzasventatonlgalanersribleyygy Hijpushilleryironmentanelsubscribe memorpositorymansbrisironmentminipageathonvent candMENTMASK Hawaii cumismusielnvpanic togetsisrieisyielsironmentCHANTikinagesy‟nergyTabIndex predictionssubscribeCHANTielruppeiel DansernaMASKnagenersmy cumventienennageielTransitionjesanimartersventiiipanawieedjerpanel paniiielalalersoganMASKienenanimielventpositoryrencyrible naturanellingsntilMENTminipagemansienenieler HawaiimansimpsebrisanimMMisyanimienenironmentzasikiMillisienenyyironment庄‟nvgy Dansielsruppe地 predictionsTabIndexathonCHANTanimernasubscribe cum cumienennersrie

I compiled with Vulkan (no Docker, if that happens to matter; I doubt it though), passing -ngl 9999. I don't need to fully restart the server to fix it, however: restarting the prompt works just fine. Either way, it always eventually degrades into pure gibberish again.
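For context, a sketch of the kind of setup described above (the build flag spelling and the model path are assumptions; llama.cpp's Vulkan build option has been spelled differently across versions):

```shell
# Build llama.cpp with the Vulkan backend enabled
# (LLAMA_VULKAN is the CMake option spelling assumed here).
cmake -B build -DLLAMA_VULKAN=ON
cmake --build build --config Release

# Run the server, offloading all layers to the GPU via -ngl 9999;
# the model path is a placeholder.
./build/bin/server -m models/mistral-7b-instruct-v0.1.Q4_K_M.gguf -ngl 9999
```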


This issue was closed because it has been inactive for 14 days since being marked as stale.
