Add llama3 #30334
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from 5b8d6d8 to 233d8ad
Force-pushed from 233d8ad to 6413e76
Managed to get this commit working, but it requires setting the flag `--llama_version 3` when calling `convert_llama_weights_to_hf.py`:

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /mnt/nvme1n1/Models/llama3/Meta-Llama-3-8B-Instruct \
    --model_size 8B \
    --output_dir /mnt/nvme1n1/Models/llama3/Meta-Llama-3-8B-Instruct-hf \
    --llama_version 3

Additionally, it tries to delete the tmp folder before it's empty, so it throws an error at the end even though the conversion is successful.
That is expected since you are converting version 3, but it should still work for earlier versions.
LGTM
<Tip warning={true}>

The `Llama3` models were trained using `bfloat16`, but the original inference uses `float16`. The checkpoints uploaded on the Hub use `torch_dtype = 'float16'`, which will be used by the `AutoModel` API to cast the checkpoints from `torch.float32` to `torch.float16`.

The `dtype` of the online weights is mostly irrelevant unless you are using `torch_dtype="auto"` when initializing a model with `model = AutoModelForCausalLM.from_pretrained("path", torch_dtype="auto")`. The reason is that the model will first be downloaded (using the `dtype` of the checkpoints online), then cast to the default `dtype` of `torch` (`torch.float32`), and finally, if a `torch_dtype` is provided in the config, it will be used.

Training the model in `float16` is not recommended and is known to produce `nan`; as such, the model should be trained in `bfloat16`.

</Tip>
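As an illustrative aside (not part of the diff), here is a minimal sketch of how `torch_dtype="auto"` changes the loaded dtype, assuming the Hub checkpoint `meta-llama/Meta-Llama-3-8B-Instruct` stores `float16` weights as described above:

```python
import torch
from transformers import AutoModelForCausalLM

# With torch_dtype="auto", the weights keep the dtype stored in the online checkpoint.
model_auto = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype="auto"
)
print(model_auto.dtype)  # torch.float16 (dtype of the checkpoint on the Hub)

# Without torch_dtype, the weights are cast to torch's default dtype.
model_default = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(model_default.dtype)  # torch.float32
```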
Good!
Failing tests are unrelated.
I get this error when using convert_llama_weights_to_hf.py: `ImportError: cannot import name 'TikTokenConverter' from 'transformers.convert_slow_tokenizer'`. How can I solve it? The version of my transformers is 4.40.1.
Hello @lhanchao777, conversion scripts should be used with a source install of `transformers`. You could also clone the repo and add it as an editable installation.
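For what it's worth, a quick hedged check (assuming `TikTokenConverter` only became available after the 4.40.x release, hence the need for a source install at the time of this PR) that your environment actually exposes it:

```python
import transformers

print(transformers.__version__)

try:
    # TikTokenConverter is what the Llama 3 conversion script imports.
    from transformers.convert_slow_tokenizer import TikTokenConverter  # noqa: F401
    print("TikTokenConverter is available")
except ImportError:
    print("TikTokenConverter is missing; install transformers from source")
```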
Hi @lhanchao777, I downloaded the model from ModelScope and hit the same problem when converting the model format. Even though I solved that problem, I still got other bugs I had no clue about, so in the end I just loaded llama3 directly from Hugging Face with `transformers`, using `model_id = "meta-llama/Meta-Llama-3-8B-Instruct"`. Hope it helps!
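A minimal sketch of that direct route through the `pipeline` API, assuming the `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint mentioned above (everything beyond the `model_id` line is illustrative filling-in, not from the original comment):

```python
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Build a text-generation pipeline; bfloat16 matches how the model was trained.
pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

output = pipe("Hey, how are you doing today?", max_new_tokens=64)
print(output[0]["generated_text"])
```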
Hi there! Not sure if this is the right place, but I'm trying to convert the 8B model to llama2. Is that possible by changing the flag in this fashion?

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 8B --output_dir /output/path --llama_version 2

--llama_version 3 works fine, but that result is not what I'm looking for in my application. I keep getting this error:

RuntimeError: Internal: could not parse ModelProto from ./Meta-Llama-3-8B/original/tokenizer.model

Thanks!
You want llama2, but your error path points at the llama3 tokenizer, "./Meta-Llama-3-8B/original/tokenizer.model"; please check your path.
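As a side note, here is a small hedged check (assumed path taken from the error above; this is not part of the conversion script) of whether a `tokenizer.model` file is a sentencepiece proto (llama2-style, convertible with `--llama_version 2`) or a tiktoken BPE file (llama3-style), which is what the `could not parse ModelProto` error usually indicates:

```python
from sentencepiece import SentencePieceProcessor

path = "./Meta-Llama-3-8B/original/tokenizer.model"  # hypothetical path from the error above

try:
    SentencePieceProcessor(model_file=path)
    print("sentencepiece proto: llama2-style tokenizer, use --llama_version 2")
except Exception:
    # Llama 3 ships a tiktoken BPE file, which sentencepiece cannot parse
    # ("could not parse ModelProto"), so convert with --llama_version 3 instead.
    print("not a sentencepiece proto (likely tiktoken): use --llama_version 3")
```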
Hi there! I followed the above instructions to convert Meta-Llama-3-8B to HF format but still got errors as follows:
I installed transformers from source under the folder.
You are most probably not using the correct original tokenizer.model. We proof-tested the script many times 😉
@ArthurZucker You are correct, I was not using this correctly. Thanks!
I also get the error "Internal: could not parse ModelProto from ./Meta-Llama-3-8B/original/tokenizer.model". I found that "Meta-Llama-3-8B-chat" doesn't have a "tokenizer.model". How can I solve it?
You should use
Traceback (most recent call last):
Could you share which version of
TODOs