
Fix trainer saving safetensors: metadata is None #28219

Merged 2 commits into huggingface:main on Jan 2, 2024

Conversation

hiyouga
Contributor

@hiyouga hiyouga commented Dec 23, 2023

What does this PR do?

Fixes hiyouga/LLaMA-Factory#1959

If we use Trainer to train a model that is not a subclass of PreTrainedModel, such as PreTrainedModelWithValueHead from the TRL library, the trainer does not write the safetensors metadata. This causes an error when reading the metadata while loading the model with AutoModelForCausalLM.from_pretrained.

The trainer currently saves without any metadata:

safetensors.torch.save_file(state_dict, os.path.join(output_dir, SAFE_WEIGHTS_NAME))

while from_pretrained expects a "format" key in the metadata:

with safe_open(resolved_archive_file, framework="pt") as f:
    metadata = f.metadata()
    if metadata.get("format") == "pt":
        pass

Although it may sound strange to load a model that does not belong to the PreTrainedModel class using AutoModelForCausalLM.from_pretrained, this approach benefits model loading by utilizing features such as low_cpu_mem_usage if the model checkpoints share the same structure.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@muellerzr @pacman100

Contributor

@pacman100 pacman100 left a comment


Thank you @hiyouga for these changes, which are in line with what save_pretrained does:

safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})

LGTM!

Collaborator

@amyeroberts amyeroberts left a comment


Thanks for fixing!

@amyeroberts amyeroberts merged commit 502a10a into huggingface:main Jan 2, 2024
21 checks passed
Saibo-creator pushed a commit to epfl-dlab/transformers-GCD-PR that referenced this pull request Jan 4, 2024

Successfully merging this pull request may close these issues.

Prediction error after full-parameter training of a Reward Model with Baichuan2-13B
3 participants