train-text-from-scratch and finetune nan loss on iter=2 #3940
Labels
bug
Something isn't working
Comments
Same here: finetune loss goes to nan on iter=2 on a Windows build (AVX2), both with and without cuBLAS.
I see the same behavior on Haswell (AVX2), followed by a segmentation fault.
GDB on segmentation fault (don't really know where to look):
Address sanitizers report 2 errors:
I was trying out the finetune example with my model, but the loss kept going to nan. I then tried train-text-from-scratch, following the instructions in its README, and it goes to nan as well. I've reproduced this on two machines.
I've bisected this, and 898aeca is the first bad commit. At the previous commit, c43c2da, train-text-from-scratch and finetune appear to work fine (they don't go to nan).
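For anyone who wants to repeat the bisect, here is a minimal sketch of a check that could classify a run as good or bad from its captured log. It assumes the training output mentions the loss on each iteration line and prints a diverged value as `nan` or `-nan`; the exact log format is not shown in this thread, so the function names and the matching rule are hypothetical.

```python
import re

# Assumption: a diverged run prints the loss as "nan" (possibly "-nan").
# The real log format is not shown in this issue, so this only looks for a
# standalone nan token on lines that mention "loss".
NAN_TOKEN = re.compile(r"(?<![A-Za-z0-9_])-?nan(?![A-Za-z0-9_])", re.IGNORECASE)


def has_nan_loss(log_text: str) -> bool:
    """Return True if any line mentions a loss and contains a nan token."""
    for line in log_text.splitlines():
        if "loss" in line.lower() and NAN_TOKEN.search(line):
            return True
    return False


def bisect_exit_code(log_text: str) -> int:
    """Map a log to the exit codes `git bisect run` expects: 0 good, 1 bad."""
    return 1 if has_nan_loss(log_text) else 0
```

To use it with `git bisect run`, a small wrapper script would build the commit, run a few training iterations, capture the output, and exit with `bisect_exit_code(output)`. With c43c2da marked good and a later failing commit marked bad, the search should land on the first bad commit.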