-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix backward rope after YaRN #3974
Merged
ggerganov
merged 6 commits into
ggerganov:master
from
xaedes:fix-backward-rope-after-yarn
Nov 7, 2023
Merged
Fix backward rope after YaRN #3974
ggerganov
merged 6 commits into
ggerganov:master
from
xaedes:fix-backward-rope-after-yarn
Nov 7, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
rope backward process was broken after YaRN RoPE (ggerganov#2268) implementation, due to missing changes in backward functions. the code for the backward process is nearly identically to the forward process: the only difference is the sign of the sin-values. to avoid future regressions remove the near-duplicate backward functions and reuse the forward code: for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`. the sin-values will be negated when forward is false.
it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants.
cebtenzzre
approved these changes
Nov 6, 2023
xpos is only implemented in ggml_compute_forward_rope_f32 - we should probably fix that. |
It's probably a good idea to rename function after that change to not only mention forward. Like |
ggerganov
approved these changes
Nov 7, 2023
olexiyb
pushed a commit
to Sanctum-AI/llama.cpp
that referenced
this pull request
Nov 23, 2023
* fix backward process of rope rope backward process was broken after YaRN RoPE (ggerganov#2268) implementation, due to missing changes in backward functions. the code for the backward process is nearly identically to the forward process: the only difference is the sign of the sin-values. to avoid future regressions remove the near-duplicate backward functions and reuse the forward code: for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`. the sin-values will be negated when forward is false. * fix finetune rope call to use correct default attn_factor of 1.0f * remove unused `ggml_rope_xpos_back` it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants. * fix comments explaining the sinus sign in ggml_forward_rope * add missing function arguments in declaration * fix function argument type in declaration
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The rope backward process was broken after YaRN RoPE (#2268) implementation, due to missing changes in the backward functions.
The code for the rope backward process is nearly identically to the forward process:
the only difference is the sign of the
sin
-values.To avoid future regressions this PR removes the near-duplicate backward functions and reuses the forward code:
For this a new function argument
bool forward
was added toggml_compute_forward_rope_f32
andggml_compute_forward_rope_f16
. The sin-values will be negated when forward is false.Additionally the rope parameters in
ggml_compute_backward
are now correctly propagated to the backward process.