Clamp out of range values in K quantizer
This assertion fails when quantizing Mixtral 8x7b as Q5_K_M, because I used `convert.py --outtype f32` and the Mixtral weights use bf16, which has a much larger exponent range than the K quantizer expects. If `--outtype f16` is used instead, the assert doesn't fail. See ggerganov/llama.cpp#2982 cc: @JohannesGaessler
ef0307e
Alright, it seems my assumptions about model weight ranges were incorrect. I really did not expect individual weights to be this large.