train_text_to_image_sdxl.py ema not working #5783

Jack000 · 2023-11-13T17:42:54Z

Describe the bug

I did a few training runs with train_text_to_image_sdxl.py before realizing that the EMA checkpoint never changed.

Comparing with train_text_to_image.py, it seems the SDXL training code never calls ema_unet.step(), so the EMA model is never updated.

It's also missing this bit at the start:

if args.use_ema:
    ema_unet.to(accelerator.device)
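
For context, here is a minimal pure-Python sketch of why a missing update call leaves the EMA checkpoint frozen. `TinyEMA` and `train` are hypothetical names written for illustration, not the diffusers `EMAModel` class, and the fixed decay of 0.9 is an assumption; the real class may schedule its decay differently.

```python
class TinyEMA:
    """Hypothetical stand-in for an EMA wrapper (not diffusers' EMAModel)."""

    def __init__(self, params, decay=0.9):
        self.decay = decay  # assumed constant decay, for illustration only
        # The shadow weights start out as a copy of the model weights.
        self.shadow = list(params)

    def step(self, params):
        # shadow <- decay * shadow + (1 - decay) * param
        self.shadow = [
            self.decay * s + (1.0 - self.decay) * p
            for s, p in zip(self.shadow, params)
        ]


def train(num_steps, call_ema_step):
    params = [0.0, 0.0]
    ema = TinyEMA(params)
    for _ in range(num_steps):
        # Pretend optimizer step: every weight moves by 1.0.
        params = [p + 1.0 for p in params]
        if call_ema_step:
            ema.step(params)
    return params, ema.shadow


# Without step(), the shadow weights stay at their initial values.
params_no_step, shadow_no_step = train(5, call_ema_step=False)
# With step(), the shadow weights trail the trained weights.
params_with_step, shadow_with_step = train(5, call_ema_step=True)
```

In the first run the trained weights move but the shadow copy does not, which matches the symptom of an EMA checkpoint that never changes.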

Reproduction

just the huggingface sample code

Logs

No response

System Info

latest version of diffusers

Who can help?

No response

linnanwang · 2023-11-14T02:16:41Z

I ran into the same issue here.

patrickvonplaten · 2023-11-14T10:44:38Z

Can you please add a reproducible code snippet?

Jack000 · 2023-11-15T07:47:28Z

I guess just the sample code, but with --use_ema turned on
from https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/README_sdxl.md

export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_NAME \
  --dataset_name=$DATASET_NAME \
  --enable_xformers_memory_efficient_attention \
  --resolution=512 --center_crop --random_flip \
  --proportion_empty_prompts=0.2 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 --gradient_checkpointing \
  --max_train_steps=10000 \
  --use_8bit_adam \
  --learning_rate=1e-06 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --report_to="wandb" \
  --validation_prompt="a cute Sundar Pichai creature" --validation_epochs 5 \
  --checkpointing_steps=5000 \
  --output_dir="sdxl-pokemon-model" \
  --use_ema

This will error out with "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!" because the script is missing:

if args.use_ema:
    ema_unet.to(accelerator.device)
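
A rough sketch of where the two missing pieces might sit in a training loop. `FakeEMA` and `run_loop` are hypothetical stubs made up for illustration; only the call names `ema_unet.to(...)` and `ema_unet.step(...)` come from this thread. Running the EMA update once per optimizer step (rather than once per micro-batch) is an assumption of this sketch. The thread does not say where the real script raises, so the stub raises inside `step()` purely to illustrate the mismatch.

```python
class FakeEMA:
    """Hypothetical stub that records its device and how often step() ran."""

    def __init__(self):
        self.device = "cpu"
        self.num_steps = 0

    def to(self, device):
        self.device = device

    def step(self, model_device):
        # Illustrative only: fail if EMA weights and model weights
        # live on different devices.
        if self.device != model_device:
            raise RuntimeError(
                f"device mismatch: {self.device} vs {model_device}"
            )
        self.num_steps += 1


def run_loop(num_batches, grad_accum_steps, use_ema, move_ema_to_device=True):
    device = "cuda:0"  # just a label here; no GPU is touched
    ema_unet = FakeEMA()
    if use_ema and move_ema_to_device:
        ema_unet.to(device)  # the "bit at the start"
    for batch_idx in range(num_batches):
        # ... forward / backward on one micro-batch would go here ...
        sync_gradients = (batch_idx + 1) % grad_accum_steps == 0
        if sync_gradients and use_ema:
            ema_unet.step(device)  # the missing update
    return ema_unet


# 8 micro-batches with gradient accumulation of 4 give 2 EMA updates.
ema = run_loop(num_batches=8, grad_accum_steps=4, use_ema=True)
```

Calling `run_loop(8, 4, True, move_ema_to_device=False)` raises the stub's `RuntimeError`, mirroring the cuda:0 vs cpu mismatch reported above.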

yiyixuxu · 2023-11-15T18:38:37Z

Hi @Jack000,
are you interested in opening a PR?

github-actions · 2023-12-26T15:07:28Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul · 2024-01-05T02:33:10Z

So sorry for the delay here. Apologies. Could you please submit a PR fixing the issue? Looks like you have already found the bug.

github-actions · 2024-01-29T15:06:50Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul · 2024-01-29T15:11:56Z

Gentle bump :-)

tolgacangoz · 2024-02-21T13:57:29Z

Since nobody has done that for a while, I gave it a try.

Jack000 added the "bug" label (Something isn't working) Nov 13, 2023

github-actions bot added the "stale" label (Issues that haven't received updates) Dec 26, 2023

tolgacangoz mentioned this issue Feb 21, 2024

Fix EMA in train_text_to_image_sdxl.py #7048

Merged

yiyixuxu removed the "stale" label (Issues that haven't received updates) Feb 22, 2024

yiyixuxu closed this as completed in #7048 Feb 26, 2024
