quantize: k_quants.c:73: nearest_int: Assertion `fval <= 4194303.f' failed. #2982
Comments
#2434 might fix this if implemented?
Does #3010 solve it?

In order to get this assertion, all weights in a block of 256 must be zero. This has never happened before, so I wonder how meaningful this model is. Although I'm somewhat surprised to see …

@KerfuffleV2 No, #2434 will not be solving it. The zeros (or …
My mistake. I was thinking it was a model that just had particularly large values in the weights.
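To make the failure mode described above concrete, here is a minimal, self-contained sketch (a simplified stand-in, not the actual k_quants quantization code) of why a block of 256 all-zero weights trips the assertion: with a zero max, the inverse scale becomes infinite, and 0 * inf is NaN, which fails the `fval <= 4194303.f` comparison.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

int main(void) {
    float block[256] = {0};            // every weight in the block is zero
    float amax = 0.0f;                 // max |weight| over the block
    for (int i = 0; i < 256; ++i) {
        float a = fabsf(block[i]);
        if (a > amax) amax = a;
    }
    // Simplified inverse scale for a 5-bit quant; with amax == 0 this
    // is +inf under IEEE-754 semantics.
    float iscale = 31.0f / amax;
    // 0.0f * inf is NaN, and (NaN <= 4194303.f) is false, so the
    // assert in nearest_int() fires.
    float fval = block[0] * iscale;
    printf("iscale = %f, fval = %f\n", iscale, fval);
    assert(fval <= 4194303.f);
    return 0;
}
```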
This assertion fails when quantizing Mixtral 8x7b as Q5_K_M because I used `convert.py --outtype f32`, and the Mixtral weights use bf16, which has a much larger exponent range than the K quantizer is expecting. If `--outtype f16` is used, the assert doesn't fail. See ggerganov/llama.cpp#2982 cc: @JohannesGaessler
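To illustrate the exponent-range point (my numbers, not from the thread): bf16 keeps float32's 8 exponent bits, so finite bf16 values reach roughly 3.4e38, far above the 2^22 - 1 bound that `nearest_int` asserts on, while fp16's 5 exponent bits top out at 65504, comfortably below it. A small C check:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// bf16 is simply the top 16 bits of an IEEE-754 float32.
static float bf16_to_f32(uint16_t h) {
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

int main(void) {
    printf("largest finite bf16: %g\n", bf16_to_f32(0x7F7F)); // ~3.39e38
    printf("largest finite fp16: %g\n", 65504.0f);
    printf("nearest_int bound  : %g\n", 4194303.0f);          // 2^22 - 1
    return 0;
}
```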
Original report: While trying to quantize Huginn-22b-Prototype to Q5_0, I ran into this assertion failure while quantizing the output tensor:

quantize: k_quants.c:73: nearest_int: Assertion `fval <= 4194303.f' failed.

It happens here:
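For reference, `nearest_int` in `k_quants.c` looks approximately like this (reconstructed from the llama.cpp source of that era; exact line numbers may differ). The rounding trick adds 1.5 * 2^23 so the rounded integer lands in the low mantissa bits, which is why inputs must stay below 2^22:

```c
#include <assert.h>
#include <string.h>  // includes added so the snippet stands alone

static inline int nearest_int(float fval) {
    assert(fval <= 4194303.f);        // 2^22 - 1: the assertion that fails
    float val = fval + 12582912.f;    // 12582912 = 1.5 * 2^23 rounding trick
    int i; memcpy(&i, &val, sizeof(int));
    return (i & 0x007fffff) - 0x00400000;
}
```

Any `fval` above 2^22 - 1, or a NaN produced by a zero scale, fails the comparison before the bit trick runs.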