Fix YaRN ramp calculation and add --yarn-orig-ctx #2
This fixes a subtle bug in the YaRN implementation. When calculating the linear ramp, we're attempting to replicate code that guards against a zero-width ramp: when `min == max`, we want `max - min = 0.001`. The code currently calculates a particular entry of `linear_func` using `min(0.001, high - low)`. But when `high - low == 0`, `min(0.001, 0) = 0`, not `0.001`. The fix is to change the `min` to a `max`.

I've also added code to set `--yarn-orig-ctx` from the command line, so that models such as TheBloke/Yarn-Llama-2-7B-64K-GGUF, which were converted without the GGUF YaRN keys, can still be used (if the correct values are passed on the command line).
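To make the ramp fix concrete, here is a minimal Python sketch of the two behaviors. It is not the code from this PR: the function names `denom_buggy`, `denom_fixed`, and `linear_ramp` are hypothetical, and the clamp to `[0, 1]` is an assumption about how the ramp is used after the division.

```python
def denom_buggy(low: float, high: float) -> float:
    # Pre-fix behaviour as described above: min() picks the smaller value,
    # so a zero-width ramp keeps a span of 0 instead of being widened.
    return min(0.001, high - low)


def denom_fixed(low: float, high: float) -> float:
    # Post-fix behaviour: max() leaves the span alone unless it is
    # smaller than 0.001, in which case 0.001 is used.
    return max(0.001, high - low)


def linear_ramp(low: float, high: float, dim: int) -> list[float]:
    """Ramp rising from 0 at index `low` to 1 at index `high`.

    Assumption (not stated in the description): values are clamped to [0, 1].
    """
    denom = denom_fixed(low, high)
    return [min(1.0, max(0.0, (i - low) / denom)) for i in range(dim)]


if __name__ == "__main__":
    print(denom_buggy(4.0, 4.0))      # span stays 0, so a division would blow up
    print(denom_fixed(4.0, 4.0))      # span is widened to 0.001
    print(linear_ramp(2.0, 6.0, 8))   # ordinary ramp
    print(linear_ramp(4.0, 4.0, 6))   # degenerate ramp becomes a step
```

With `low == high`, the fixed version degrades to a step function rather than dividing by zero. Note that the `min` form also caps any span larger than `0.001` at `0.001`, so in this sketch it would distort ordinary ramps as well.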