Hardware Environment
Please tell us which backend type reproduces your error:
[√] Ascend
[ ] GPU
[ ] CPU
Software Environment
MindSpore version:
Please tell us which MindSpore version you are using:
[ ] 2.1
[ ] 2.0.0
[√] other (please state here): 2.3.0
Python version (e.g., 3.7.5): 3.9.18
OS (e.g., Linux Ubuntu 16.04): EulerOS 2.0 (SP8), CANN-8.0.RC1
GCC/Compiler version: 7.3.0
Describe the current behavior
2024-08-14 17:10:00,056 - modelscope - INFO - PyTorch version 2.1.0 Found.
2024-08-14 17:10:00,061 - modelscope - INFO - Loading ast index from /home/ma-user/.cache/modelscope/ast_indexer
2024-08-14 17:10:00,123 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 54d31e3d3abbdd999283f7b24d7db88f and a total number of 980 components indexed
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/torch_npu/utils/path_manager.py:79: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current user.
warnings.warn(f"Warning: The {path} owner does not match the current user.")
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/torch_npu/utils/path_manager.py:79: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.RC1/aarch64-linux/ascend_toolkit_install.info owner does not match the current user.
warnings.warn(f"Warning: The {path} owner does not match the current user.")
08/14/2024 17:10:57 - INFO - __main__ - UNet2DConditionModel ==> Trainable params: 797,184 || All params: 860,318,148 || Trainable ratio: 0.09266153%
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 297.08it/s]
You have disabled the safety checker for <class 'mindone.diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
08/14/2024 17:11:01 - INFO - __main__ - ***** Running training *****
08/14/2024 17:11:01 - INFO - __main__ - Num examples = 856
08/14/2024 17:11:01 - INFO - __main__ - Num Epochs = 100
08/14/2024 17:11:01 - INFO - __main__ - Instantaneous batch size per device = 1
08/14/2024 17:11:01 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
08/14/2024 17:11:01 - INFO - __main__ - Gradient Accumulation steps = 1
08/14/2024 17:11:01 - INFO - __main__ - Total optimization steps = 85600
08/14/2024 17:11:01 - INFO - __main__ - Running validation...
Generating 4 images with prompt: a man in a straw hat.
08/14/2024 17:15:17 - INFO - __main__ - Validation done.
Steps: 0%| | 0/85600 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py", line 955, in <module>
    main()
  File "/home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py", line 790, in main
    loss, model_pred = train_step(*batch)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py", line 703, in __call__
    out = self.compile_and_run(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py", line 1071, in compile_and_run
    self.compile(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py", line 1054, in compile
    _cell_graph_executor.compile(self, *self._compile_args, phase=self.phase,
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/common/api.py", line 1819, in compile
    result = self._graph_executor.compile(obj, args, kwargs, phase, self._use_vm_mode())
TypeError: getattr(): attribute name must be string but got: External
mindspore/ccsrc/pipeline/jit/ps/static_analysis/prim.cc:3874 EvalPrim
0 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindone/diffusers/training_utils.py:713-728, 8-45
if self.sync_gradients:
1 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindone/diffusers/training_utils.py:710, 25-59
outputs, grads = self.forward_and_backward(*inputs)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py:589, 31-56
return grad_(fn, weights)(*args)
^~~~~~~~~~~~~~~~~~~~~~~~~
3 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py:589, 37-39
return grad_(fn, weights)(*args)
^~
4 In file /home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py:923-928, 8-95
if self.noise_scheduler_prediction_type == "epsilon":
5 In file /home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py:924, 12-26
target = noise
^~~~~~~~~~~~~~
6 In file /home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py:933-948, 8-30
if self.args.snr_gamma is None:
^
7 In file /home/ma-user/work/mindone/examples/diffusers/text_to_image/train_text_to_image_lora.py:915, 24-81
noisy_latents = self.noise_scheduler.add_noise(latents, noise, timesteps)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindone/diffusers/schedulers/scheduling_ddpm.py:515, 25-44
alphas_cumprod = self.alphas_cumprod.to(dtype=original_samples.dtype)
^~~~~~~~~~~~~~~~~~~
9 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindone/diffusers/configuration_utils.py:132, 61-107
is_in_config = "_internal_dict" in self.__dict__ and hasattr(self.__dict__["_internal_dict"], name)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10 In file /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/_extends/parse/standard_method.py:384, 10-40
out = getattr(x, attr, mstype._null)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(See file '/home/ma-user/work/mindone/examples/diffusers/text_to_image/rank_0/om/analyze_fail.ir' for more details. Get instructions about analyze_fail.ir at https://www.mindspore.cn/search?inputValue=analyze_fail.ir)
Steps: 0%| | 0/85600 [00:10<?, ?it/s]
Describe the expected behavior
1. Training runs to completion successfully.
Steps to reproduce the issue
Run:
python train_text_to_image_lora.py \
  --pretrained_model_name_or_path=/home/ma-user/work/stable-diffusion-v1-4/ \
  --dataset_name=/home/ma-user/work/onepiece-blip-captions/ \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 \
  --checkpointing_steps=5000 \
  --learning_rate=1e-04 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --seed=42 \
  --validation_prompt="a man in a straw hat" \
  --output_dir="sd-onepiece-model-lora-$(date +%Y%m%d%H%M%S)"