updated RoPE statement (#423)
* updated RoPE statement

* updated .gitignore

* Update ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
d-kleine and rasbt authored Oct 30, 2024
1 parent b5f2aa3 commit 81eed9a
Showing 2 changed files with 3 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .gitignore
@@ -44,6 +44,8 @@ ch05/07_gpt_to_llama/Llama-3.1-8B
 ch05/07_gpt_to_llama/Llama-3.1-8B-Instruct
 ch05/07_gpt_to_llama/Llama-3.2-1B
 ch05/07_gpt_to_llama/Llama-3.2-1B-Instruct
+ch05/07_gpt_to_llama/Llama-3.2-3B
+ch05/07_gpt_to_llama/Llama-3.2-3B-Instruct
 
 ch06/01_main-chapter-code/gpt2
 ch06/02_bonus_additional-experiments/gpt2
2 changes: 1 addition & 1 deletion ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb
@@ -409,7 +409,7 @@
 "self.pos_emb = nn.Embedding(cfg[\"context_length\"], cfg[\"emb_dim\"])\n",
 "```\n",
 "\n",
-"- Instead of these absolute positional embeddings, Llama uses relative positional embeddings, called rotary position embeddings (RoPE for short)\n",
+"- Unlike traditional absolute positional embeddings, Llama uses rotary position embeddings (RoPE), which enable it to capture both absolute and relative positional information simultaneously\n",
 "- The reference paper for RoPE is [RoFormer: Enhanced Transformer with Rotary Position Embedding (2021)](https://arxiv.org/abs/2104.09864)"
 ]
 },
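For context on what the reworded bullet claims, here is a minimal, self-contained sketch of the RoPE idea (an illustration under assumptions, not the notebook's implementation; the function name `rope` and the default `theta_base=10_000.0` are chosen for this example). It uses the interleaved pairwise rotation from the RoFormer paper; many Llama implementations instead rotate the first and second halves of the head dimension, but the relative-position property is the same:

```python
# Minimal RoPE sketch (illustrative only, not the notebook's code).
import torch

def rope(x, theta_base=10_000.0):
    """Rotate pairs of dimensions of x by position-dependent angles.

    x: tensor of shape (seq_len, head_dim), with head_dim even.
    """
    seq_len, head_dim = x.shape
    # Per-pair inverse frequencies: theta_i = base^(-2i / head_dim)
    inv_freq = 1.0 / theta_base ** (torch.arange(0, head_dim, 2) / head_dim)
    # Angle at position m for frequency i is m * theta_i
    angles = torch.arange(seq_len).unsqueeze(1) * inv_freq  # (seq_len, head_dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]  # interleaved (even, odd) pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # standard 2D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Why RoPE captures both absolute and relative information: each position
# gets its own rotation (absolute), but dot products between rotated
# vectors depend only on the positional offset (relative).
v = torch.randn(8)
r = rope(v.repeat(6, 1))      # the same vector placed at positions 0..5
print(torch.dot(r[0], r[1]))  # offset 1
print(torch.dot(r[3], r[4]))  # offset 1 -> same value (up to float error)
```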
