minor fixes: Llama 3.2 standalone #420

Merged
merged 2 commits into rasbt:main from d-kleine:llama32 on Oct 26, 2024

Conversation

d-kleine
Contributor

  • minor fixes and improvements for the Llama 3.2 standalone notebook


@d-kleine
Contributor Author

If you find some spare time, it would be great if you could make these formatting changes to this figure:

Bold print seems to indicate what has changed relative to the previous model, so:

  • Llama 3 8B: "32 heads" should not be in bold print (no change)
  • Llama 3 8B: "Supported context length of 131k tokens" should be in bold print (change)

I really enjoy these notebooks and figures around GPT-2 and the Llama 2/3 models here - it's a great round-up of the contents of the book, both technically (code) and visually (figures)!

d-kleine marked this pull request as ready for review on October 26, 2024, 00:35
@rasbt
Owner

rasbt commented Oct 26, 2024

Good catch regarding the 72. I also reformatted the RoPE base as float to make it consistent with the other RoPE float settings.
(I synced the figure, but it may be a few hours until the change takes effect due to GitHub's caching.)
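
For context, a minimal sketch of what the float change could look like; the config name and the surrounding keys below are assumptions for illustration, not copied from the notebook:

```python
# Illustrative config sketch (names/values are assumptions, not the notebook's exact settings):
# writing the RoPE base as a float keeps it consistent with the other float-valued RoPE settings.
LLAMA32_CONFIG = {
    "rope_base": 500_000.0,   # RoPE theta written as a float rather than an int
    "rope_freq": {            # hypothetical frequency-scaling settings, also floats
        "factor": 32.0,
        "low_freq_factor": 1.0,
        "high_freq_factor": 4.0,
    },
}
```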

rasbt merged commit e8c2f96 into rasbt:main on Oct 26, 2024
8 checks passed
d-kleine deleted the llama32 branch on October 26, 2024, 02:28
@d-kleine
Contributor Author

d-kleine commented Oct 26, 2024

@rasbt Thanks! I just took a look at the updated figure; the "32 heads" entry for Llama 3 8B is still in bold print.
[screenshot of the figure showing "32 heads" still in bold]

Also, I spotted another piece of information in the figure that might need an update:

  • Llama 2 7B: Here, it says that RoPE captures the relative positional embeddings (do you mean rotary?). But RoPE does not only capture relative positions, otherwise Meta could have used sinusoidal PE. RoPE captures both absolute and relative positional information simultaneously, which makes it superior to relative-only PE.
    [screenshot of the Llama 2 7B part of the figure]
  • (I think it would also be useful to add the info that RoPE is only applied to the queries and keys, but not to the values; this is not clear from the figure itself. See the sketch after this list.)
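
A minimal PyTorch sketch of both points (the function names and the base value are illustrative, not the notebook's implementation): each position gets its own rotation, so absolute positions are encoded while query-key dot products depend only on the relative offset, and the rotation is applied to the queries and keys only, never to the values:

```python
import torch

def rope_params(head_dim, context_len, base=500_000.0):
    # Pairwise inverse frequencies; each position gets its own rotation angles.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.arange(context_len).float()[:, None] * inv_freq[None, :]
    angles = torch.cat([angles, angles], dim=-1)          # (context_len, head_dim)
    return torch.cos(angles), torch.sin(angles)

def apply_rope(x, cos, sin):
    # x: (batch, num_heads, seq_len, head_dim); rotate each position by its own angle.
    seq_len = x.shape[2]
    x1, x2 = x.chunk(2, dim=-1)
    rotated = torch.cat((-x2, x1), dim=-1)
    return x * cos[:seq_len] + rotated * sin[:seq_len]

cos, sin = rope_params(head_dim=64, context_len=128)
q = torch.randn(1, 8, 128, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)
q, k = apply_rope(q, cos, sin), apply_rope(k, cos, sin)  # values stay untouched
```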

You could also add the information that Llama 2 already used GQA for the larger models (34B and 70B) for improved inference scalability; I think this is interesting information for the figure. (A minimal sketch of the GQA idea follows below.)
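
For illustration, a minimal grouped-query attention sketch; the head counts here are assumptions, not Llama 2's actual configuration. Fewer key/value heads than query heads means a smaller KV cache, which is where the inference-scalability benefit comes from:

```python
import torch

batch, seq_len, head_dim = 1, 16, 64
num_q_heads, num_kv_heads = 8, 2              # illustrative head counts
group_size = num_q_heads // num_kv_heads      # 4 query heads share each K/V head

q = torch.randn(batch, num_q_heads, seq_len, head_dim)
k = torch.randn(batch, num_kv_heads, seq_len, head_dim)   # fewer K/V heads -> smaller KV cache
v = torch.randn(batch, num_kv_heads, seq_len, head_dim)

# Repeat the K/V heads so they line up with the query heads, then attend as usual.
k = k.repeat_interleave(group_size, dim=1)
v = v.repeat_interleave(group_size, dim=1)
attn = torch.softmax(q @ k.transpose(-2, -1) / head_dim**0.5, dim=-1)
out = attn @ v                                # (batch, num_q_heads, seq_len, head_dim)
```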

@rasbt
Owner

rasbt commented Oct 27, 2024

Thanks, I will try to update it in the next few days!

@rasbt
Owner

rasbt commented Oct 30, 2024

Looks like I had fixed the "heads" in the Llama figure but then forgot to apply it to some of the figures where it's used as a subfigure. Good call regarding the RoPE, btw. Should be taken care of now!

@d-kleine
Contributor Author

Looks great, thanks! Superb comprehensive overview btw!
