
FileNotFoundError: [Errno 2] No such file ./accuracies.png' #3225

Closed
1 task done
Katehuuh opened this issue Apr 10, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@Katehuuh
Contributor

Reminder

  • I have read the README and searched the existing issues.

Reproduction

Trained a LoRA with ORPO; the LoRA adapter itself was saved successfully, so this is not a pre-existing issue.

Log from commit caf8373
...
{'loss': 1.9317, 'grad_norm': 219.4456024169922, 'learning_rate': 3.412407220904079e-11, 'rewards/chosen': -0.18838071823120117, 'rewards/rejected': -0.7961876392364502, 'rewards/accuracies': 0.800000011920929, 'rewards/margins': 0.6078068614006042, 'logps/rejected': -7.961875915527344, 'logps/chosen': -1.8838069438934326, 'logits/rejected': -1.2117503881454468, 'logits/chosen': -1.287316918373108, 'sft_loss': 1.8838069438934326, 'odds_ratio_loss': 0.47844308614730835, 'epoch': 1.0}
{'loss': 0.8209, 'grad_norm': 6.5753984451293945, 'learning_rate': 1.5166263899191182e-11, 'rewards/chosen': -0.08103003352880478, 'rewards/rejected': -1.135803461074829, 'rewards/accuracies': 1.0, 'rewards/margins': 1.0547735691070557, 'logps/rejected': -11.35803508758545, 'logps/chosen': -0.8103002309799194, 'logits/rejected': -1.3508708477020264, 'logits/chosen': -1.3301522731781006, 'sft_loss': 0.8103002309799194, 'odds_ratio_loss': 0.10593117773532867, 'epoch': 1.0}
{'loss': 0.9022, 'grad_norm': 5.922697067260742, 'learning_rate': 3.7915674122590565e-12, 'rewards/chosen': -0.08883383125066757, 'rewards/rejected': -0.6862779855728149, 'rewards/accuracies': 1.0, 'rewards/margins': 0.5974441766738892, 'logps/rejected': -6.8627800941467285, 'logps/chosen': -0.8883382678031921, 'logits/rejected': -1.1119892597198486, 'logits/chosen': -1.4197337627410889, 'sft_loss': 0.8883382678031921, 'odds_ratio_loss': 0.1383415162563324, 'epoch': 1.0}
{'loss': 1.4268, 'grad_norm': 9.195427894592285, 'learning_rate': 0.0, 'rewards/chosen': -0.14172907173633575, 'rewards/rejected': -0.7929937243461609, 'rewards/accuracies': 1.0, 'rewards/margins': 0.6512646675109863, 'logps/rejected': -7.929937839508057, 'logps/chosen': -1.4172906875610352, 'logits/rejected': -1.5984668731689453, 'logits/chosen': -1.2286322116851807, 'sft_loss': 1.4172906875610352, 'odds_ratio_loss': 0.09547214210033417, 'epoch': 1.0}
100%|██████████████████████████████████████████████████████████████████████████| 12855/12855 [7:44:26<00:00,  1.59s/it][INFO|trainer.py:2231] 2024-04-10 13:28:41,972 >>

Training completed. Do not forget to share your model on huggingface.co/models =)


{'train_runtime': 27866.917, 'train_samples_per_second': 0.461, 'train_steps_per_second': 0.461, 'train_loss': 1.0734075052920875, 'epoch': 1.0}
100%|██████████████████████████████████████████████████████████████████████████| 12855/12855 [7:44:26<00:00,  2.17s/it]
[INFO|trainer.py:3203] 2024-04-10 13:28:42,014 >> Saving model checkpoint to saves\LLaMA2-13B-Chat\lora\LoRa_ORPO
C:\LLaMA-Factory\venv\lib\site-packages\peft\utils\save_and_load.py:154: UserWarning: Could not find a config file in C:\LLaMA-Factory\checkpoints\Llama-2-13b-chat-hf - will assume that the vocabulary was not modified.
  warnings.warn(
[INFO|tokenization_utils_base.py:2502] 2024-04-10 13:28:43,665 >> tokenizer config file saved in saves\LLaMA2-13B-Chat\lora\LoRa_ORPO\tokenizer_config.json
[INFO|tokenization_utils_base.py:2511] 2024-04-10 13:28:43,667 >> Special tokens file saved in saves\LLaMA2-13B-Chat\lora\LoRa_ORPO\special_tokens_map.json
***** train metrics *****
  epoch                    =        1.0
  train_loss               =     1.0734
  train_runtime            = 7:44:26.91
  train_samples_per_second =      0.461
  train_steps_per_second   =      0.461
Figure saved at: saves\LLaMA2-13B-Chat\lora\LoRa_ORPO\training_loss.png
04/10/2024 13:28:44 - WARNING - llmtuner.extras.ploting - No metric eval_loss to plot.
Traceback (most recent call last):
  File "C:\LLaMA-Factory\src\train_bash.py", line 14, in <module>
    main()
  File "C:\LLaMA-Factory\src\train_bash.py", line 5, in main
    run_exp()
  File "C:\LLaMA-Factory\src\llmtuner\train\tuner.py", line 41, in run_exp
    run_orpo(model_args, data_args, training_args, finetuning_args, callbacks)
  File "C:\LLaMA-Factory\src\llmtuner\train\orpo\workflow.py", line 59, in run_orpo
    plot_loss(training_args.output_dir, keys=["loss", "eval_loss", "rewards/accuracies", "sft_loss"])
  File "C:\LLaMA-Factory\src\llmtuner\extras\ploting.py", line 56, in plot_loss
    plt.savefig(figure_path, format="png", dpi=100)
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\pyplot.py", line 1134, in savefig
    res = fig.savefig(*args, **kwargs)  # type: ignore[func-returns-value]
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\figure.py", line 3390, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\backend_bases.py", line 2193, in print_figure
    result = print_method(
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\backend_bases.py", line 2043, in <lambda>
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\backends\backend_agg.py", line 497, in print_png
    self._print_pil(filename_or_obj, "png", pil_kwargs, metadata)
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\backends\backend_agg.py", line 446, in _print_pil
    mpl.image.imsave(
  File "C:\LLaMA-Factory\venv\lib\site-packages\matplotlib\image.py", line 1656, in imsave
    image.save(fname, **pil_kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\PIL\Image.py", line 2436, in save
    fp = builtins.open(filename, "w+b")
FileNotFoundError: [Errno 2] No such file or directory: 'saves\\LLaMA2-13B-Chat\\lora\\LoRa_ORPO\\training_rewards/accuracies.png'
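
The crash occurs because `plot_loss` builds the figure filename directly from the metric key, and ORPO logs keys such as `rewards/accuracies` that contain a slash. The resulting path `training_rewards/accuracies.png` is interpreted as a file inside a non-existent `training_rewards` subdirectory, so `open` fails. A minimal sketch of the kind of key sanitization that avoids this (the function name here is illustrative, not the actual LLaMA-Factory fix):

```python
import os

def figure_path_for(output_dir: str, key: str) -> str:
    # Metric keys like "rewards/accuracies" contain "/", which the OS
    # treats as a directory separator. Replace separators so the figure
    # name stays a plain filename inside output_dir.
    safe_key = key.replace("/", "_").replace("\\", "_")
    return os.path.join(output_dir, "training_{}.png".format(safe_key))
```

With this, `rewards/accuracies` maps to `training_rewards_accuracies.png` in the output directory instead of a missing subdirectory.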

Expected behavior

No response

System Info

No response

Others

No response

@hiyouga
Owner

hiyouga commented Apr 10, 2024

fixed

@hiyouga hiyouga added the solved This problem has been already solved label Apr 10, 2024
tybalex added a commit to sanjay920/LLaMA-Factory that referenced this issue Apr 10, 2024
* fix hiyouga#3225
