LoRA from train_dreambooth_lora_sdxl.py is not working in A1111 anymore #6894

patryk-bartkowiak-nitid · 2024-02-07T14:20:01Z

Describe the bug

I have been using train_dreambooth_lora_sdxl.py and convert_diffusers_sdxl_lora_to_webui.py to train a LoRA for a specific character. It was working until about a week ago. I am using the same baseline model and the same data.

I noticed that all my previous LoRA files were 29967176 bytes, while the new ones are 29889672 bytes and have fewer keys in the dict when I load them as plain .safetensors files.

I found that it works fine with the inference guide in the README:

import torch
from diffusers import DiffusionPipeline

pretrained_model = "./pretrained_models/dreamshaper-xl"
lora_weights = "./outputs/dreamshaper-xl_claire/checkpoint-2000/"

prompt = "photo of wff woman, sitting in train"
negative_prompt = "text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated"

pipe = DiffusionPipeline.from_pretrained(pretrained_model, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.load_lora_weights(lora_weights)

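# The pipeline call has no `seed` parameter; pass a seeded generator instead.
generator = torch.Generator(device="cuda").manual_seed(420)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    generator=generator,
).images[0]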

image.save("lora_inference.png")

But after I convert it and load it into A1111 (it loads correctly), it doesn't work anymore; it looks like it only adds some noise to the output.

I already tried rolling back to previous commits of diffusers, torch and torchvision, but nothing really helps. I am still not able to use the LoRA in A1111.

Reproduction

Code to train LoRA:

export MODEL_NAME="pretrained_models/dreamshaper-xl"
export INSTANCE_DIR="data/claire"
export MAX_TRAIN_STEPS=5000
export CHECKPOINTING_STEPS=500


export OUTPUT_DIR="outputs/$(basename ${MODEL_NAME})_$(basename ${INSTANCE_DIR})_tmp"
export CUDA_LAUNCH_BLOCKING=1
export TORCH_USE_CUDA_DSA=1

printf "\n\nTraining Claire model with $MODEL_NAME on $INSTANCE_DIR, saving to $OUTPUT_DIR\n\n"

accelerate launch diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py \
	--instance_prompt="photo of wff woman, isolated on white background" \
	--pretrained_model_name_or_path=$MODEL_NAME \
	--instance_data_dir=$INSTANCE_DIR \
	--output_dir=$OUTPUT_DIR \
	--resolution=1024 \
	--train_batch_size=2 \
	--gradient_accumulation_steps=4 \
	--learning_rate=1e-4 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--max_train_steps=$MAX_TRAIN_STEPS \
	--seed="0" \
	--train_text_encoder \
	--enable_xformers_memory_efficient_attention \
	--gradient_checkpointing \
	--use_8bit_adam \
	--checkpointing_steps=$CHECKPOINTING_STEPS

Code to convert to A1111 format

python /project/diffusers/scripts/convert_diffusers_sdxl_lora_to_webui.py {input_path} {output_path}

Logs

I can't really post any errors; it looks like typical image generation, with no errors or warnings during training and conversion.

System Info

- `diffusers` version: 0.26.0.dev0
- Platform: Linux-5.15.0-92-generic-x86_64-with-glibc2.27
- Python version: 3.10.9
- PyTorch version (GPU?): 2.0.0 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.37.2
- Accelerate version: 0.26.1
- xFormers version: 0.0.19
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

@yiyixuxu @sayakpaul @DN6 @patrickvonplaten


sayakpaul · 2024-02-08T04:11:21Z

Thanks for the detailed thread. Can you pin me a version that was working as expected for you?

I am asking because none of those scripts went through significant logical changes in the past 7 days.

patryk-bartkowiak-nitid · 2024-02-08T07:16:49Z

Yeah that's the thing, I am unable to restore the environment perfectly and I'm blocked right now, not sure where the issue is :/

sayakpaul · 2024-02-08T07:18:18Z

Ah then it's a bit of a pity. In any case, please do ping me here if you're able to give me a pinpointed version. I am happy to look further from there :-)

patryk-bartkowiak-nitid · 2024-02-08T07:20:26Z

Anyway, following the README guide, it's not working properly. I am happy to meet or whatever to solve this issue :)

sayakpaul · 2024-02-08T07:22:15Z

README guide? Do you mean the commands from https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sdxl.md don't work? Can you provide a fully reproducible snippet for me?

"I am happy to meet or whatever to solve this issue :)"

Sorry, we cannot do that. As maintainers, we need to be cognizant of our time and keep the discussions as open as possible.

patryk-bartkowiak-nitid · 2024-02-08T07:44:51Z

I mean command from
https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sdxl.md
combined with
https://github.com/huggingface/diffusers/blob/main/scripts/convert_diffusers_sdxl_lora_to_webui.py

I'm not sure which part of the pipeline has the issue. Like I said, I am able to use the LoRA with the inference code you provide in the README, but I can't convert it correctly. It might be the conversion itself, or the LoRA may have some different properties that the conversion script can't handle.

Let me send you the full pipeline so you can reproduce the issue. I will try to include as many details as possible:

Create VM with this docker image: pytorch/pytorch:2.0.0-cuda11.7-cudnn8-devel
Install dependencies:

apt update
apt install vim git tmux ffmpeg libsm6 libxext6 wget python3 python3-venv libgl1 libglib2.0-0 google-perftools -y

git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
cd examples/dreambooth
pip install -r requirements.txt
accelerate config default
pip install bitsandbytes xformers==0.0.19

Download baseline SDXL model:

wget https://civitai.com/api/download/models/333449 -O DreamShaperXL.safetensors

Convert .safetensors to suitable format using python:

import diffusers
pipe = diffusers.StableDiffusionXLPipeline.from_single_file("DreamShaperXL.safetensors")
pipe.save_pretrained("DreamShaperXL")

Train LoRA (6 images with the same woman on white background):

export MODEL_NAME="DreamShaperXL"
export INSTANCE_DIR="data/claire"
export MAX_TRAIN_STEPS=5000
export CHECKPOINTING_STEPS=500


export OUTPUT_DIR="outputs/$(basename ${MODEL_NAME})_$(basename ${INSTANCE_DIR})"
export CUDA_LAUNCH_BLOCKING=1
export TORCH_USE_CUDA_DSA=1

printf "\n\nTraining Claire model with $MODEL_NAME on $INSTANCE_DIR, saving to $OUTPUT_DIR\n\n"

accelerate launch diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py \
	--instance_prompt="photo of wff woman, isolated on white background" \
	--pretrained_model_name_or_path=$MODEL_NAME \
	--instance_data_dir=$INSTANCE_DIR \
	--output_dir=$OUTPUT_DIR \
	--resolution=1024 \
	--train_batch_size=2 \
	--gradient_accumulation_steps=4 \
	--learning_rate=1e-4 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--max_train_steps=$MAX_TRAIN_STEPS \
	--seed="0" \
	--train_text_encoder \
	--enable_xformers_memory_efficient_attention \
	--gradient_checkpointing \
	--use_8bit_adam \
	--checkpointing_steps=$CHECKPOINTING_STEPS

Convert to Kohya format:

python /diffusers/scripts/convert_diffusers_sdxl_lora_to_webui.py outputs/DreamShaperXL_claire/pytorch_lora_weights.safetensors test.safetensors

Move to A1111:

mv test.safetensors stable-diffusion-webui/models/Lora/

sayakpaul · 2024-02-08T07:54:25Z

As mentioned I need to know a version that was working as expected for you.

CC: @linoytsaban @apolinario here.

patryk-bartkowiak-nitid · 2024-02-08T07:57:28Z

Well because I can't really provide it - can we just focus on the current version that is probably not working properly?

I also considered that A1111 itself might be broken, but my previous LoRAs still work there, so I think it has to be something in this pipeline.

sayakpaul · 2024-02-08T07:59:40Z

That makes it a thousand times more difficult for us to make progress here actually, hence I am a bit adamant on it. To be able to pinpoint the issue -- can we say the trained LoRA provides expected results when the inference is done from diffusers?

Your initial issue description suggests so. So, I quite suspect that it's the conversion script that's the culprit here.

patryk-bartkowiak-nitid · 2024-02-08T08:02:57Z

Yes, LoRA provides expected results when the inference is done from diffusers.

When it's done in A1111, it does change the output image (same seed), but not in the way it should; it looks like it's just adding some noise at the beginning of the generation process. I will send an example in 3 minutes.

sayakpaul · 2024-02-08T08:04:30Z

Then it's quite likely that the conversion script is the problem as mentioned. So, I will let @apolinario and @linoytsaban comment further (as they are the developers of that script).

patryk-bartkowiak-nitid · 2024-02-08T08:10:45Z

A1111 Config:

photo of wff woman, rides gondola in Venice,
Negative prompt: text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated, BadDream, UnrealisticDream
Steps: 7, Sampler: DPM++ SDE Karras, CFG scale: 2, Seed: 420, Size: 1024x1024, Model hash: 676f0d60c8, Model: DreamShaperXL, Version: v1.7.0

Image without any LoRA:

Image with previously trained LoRA that works - trained for 8000 iterations with batch_size=1:

Image with new LoRA - trained for 4000 iterations with batch_size=2:

patryk-bartkowiak-nitid · 2024-02-08T10:24:09Z

Also adding an image generated locally with new LoRA that doesn't work in A1111 - trained for 4000 iterations with batch_size=2

Code to generate:

import torch
from diffusers import DiffusionPipeline

pretrained_model = "DreamShaperXL"
lora_weights = "./outputs/DreamShaperXL_claire/checkpoint-4000/"

prompt = "photo of wff woman, rides gondola in Venice,"
negative_prompt = "text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated"

pipe = DiffusionPipeline.from_pretrained(pretrained_model, torch_dtype=torch.float32)
pipe = pipe.to("cuda")
pipe.load_lora_weights(lora_weights)

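# The pipeline call has no `seed` parameter; pass a seeded generator instead.
generator = torch.Generator(device="cuda").manual_seed(420)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    generator=generator,
).images[0]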

image.save("lora_inference.png")

Image:

Note

As you can see, it's much closer. Of course the quality is not as good, because AUTOMATIC1111 applies some additional things that make it look better, like negative embeddings.

patryk-bartkowiak-nitid · 2024-02-08T12:58:34Z

Update

I tried loading the exact same model after the conversion in ComfyUI and it works properly, but I found this issue from a week ago: #6777

Do you think it's related? Did any of the LoRA keys change? It looks like A1111 does not support it yet.

sayakpaul · 2024-02-08T13:23:41Z

Could be related but the LoRA keys didn’t change. We have got multiple tests ensuring that.

linoytsaban · 2024-02-08T13:38:49Z

Hey @patryk-bartkowiak-nitid, thanks for creating this issue! Just to make sure I understand, right now the ComfyUI conversion works fine but A1111 doesn't?

patryk-bartkowiak-nitid · 2024-02-08T14:08:00Z

"Hey @patryk-bartkowiak-nitid, thanks for creating this issue! Just to make sure I understand, right now comfyUI conversion works fine but A1111 doesn't?"

Exactly

linoytsaban · 2024-02-08T14:54:03Z

Hmm, I'm not sure what has caused this since we haven't made any changes to the conversion script, and the changes made to the training script should not affect that. @sayakpaul was there any change in the peft keys maybe that would make the conversion script incompatible?

sayakpaul · 2024-02-08T14:55:30Z

No, I don’t think so. There were no changes to the training script or the underlying utils that would lead to key incompatibilities.

patryk-bartkowiak-nitid · 2024-02-08T15:04:28Z

Could this have had an impact? #6895

sayakpaul · 2024-02-08T15:06:22Z

Pretty sure not as it only touches the model card which has nothing to do with the state dict.

patryk-bartkowiak-nitid · 2024-02-09T11:18:11Z

Any ideas @sayakpaul @linoytsaban ? Still trying to figure this out

sayakpaul · 2024-02-09T11:37:57Z

Sorry but I don't work with A1111 or ComfyUI either. And I cannot offer any help related to conversion to non-diffusers formats right now.

linoytsaban · 2024-02-09T13:29:22Z

@patryk-bartkowiak-nitid can you check the state_dict of the previous LoRAs that worked fine on A1111 and of the new ones, and see what the differences are (assuming there are some, if it's incompatible)?
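A comparison like the one requested here can be sketched as a set difference over the key names of the two state dicts. This is only an illustrative sketch: the helper name `diff_keys` and the sample key names are hypothetical, not taken from the actual files. With real files, the key names could be obtained with `safetensors.torch.load_file(path).keys()`.

```python
def diff_keys(before_keys, after_keys):
    """Return (only_before, only_after, common) as sorted lists of key names."""
    before, after = set(before_keys), set(after_keys)
    return sorted(before - after), sorted(after - before), sorted(before & after)


# Hypothetical key names, for illustration only.
old_keys = [
    "lora_unet_block.to_k.lora_down.weight",
    "lora_unet_block.to_k.lora_up.weight",
    "lora_unet_block.to_k.alpha",
]
new_keys = [
    "lora_unet_block.to_k_lora.down.weight",
    "lora_unet_block.to_k_lora.up.weight",
]

only_before, only_after, common = diff_keys(old_keys, new_keys)
print("only in old file:", only_before)
print("only in new file:", only_after)
print("shared keys:", len(common))
```

Printing the two "only in" lists side by side makes naming differences (as opposed to genuinely missing tensors) easy to spot.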

patryk-bartkowiak-nitid · 2024-02-09T14:49:05Z

I compared the converted .safetensors files and have already restored the exact same structure. This is how I did it, so you can see the difference between them:

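import torch
from safetensors.torch import load_file

before = load_file("claire.safetensors")  # older LoRA that works in A1111
after = load_file("test.safetensors")  # newly converted LoRA

# Rename keys to match the older layout. Iterate over a snapshot of the keys,
# because the dict is modified inside the loop.
for k in list(after.keys()):
    v = after.pop(k)

    k = k.replace("lora.down", "lora_down")
    k = k.replace("lora.up", "lora_up")
    k = k.replace("to_k_lora", "to_k.lora")
    k = k.replace("_lora_down", ".lora_down")
    k = k.replace("_lora_up", ".lora_up")

    after[k] = v

# Set alpha to 4 for every layer that has a lora_up weight.
for layer_name in [x for x in after.keys() if x.endswith("lora_up.weight")]:
    layer_name = layer_name.replace("lora_up.weight", "alpha")
    layer_name = layer_name.replace("_alpha", ".alpha")
    after[layer_name] = torch.tensor(4)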

Now I have two .safetensors files with the exact same keys and shapes, but of course different weight values.
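The thread does not say why the alpha value 4 was chosen. Under the alpha/rank convention that Kohya-style LoRA files commonly follow (an assumption here, not something confirmed in this thread), the stored alpha is divided by the LoRA rank to get the multiplier applied to the update, so an alpha equal to the rank leaves the update unscaled. A minimal sketch with a hypothetical helper `lora_scale`:

```python
def lora_scale(alpha, rank):
    """Multiplier applied to the LoRA update under the alpha / rank convention."""
    return alpha / rank


# Assumed values for illustration: alpha 4 with rank 4, then with rank 8.
print(lora_scale(4, 4))
print(lora_scale(4, 8))
```

If the rank used in training differs from the alpha written into the file, the LoRA would be applied at a different strength than intended, so the two values are worth checking together.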

patryk-bartkowiak-nitid · 2024-02-09T14:51:47Z

Before the renaming:

intersection = set(before.keys()) & set(after.keys())

len(before), len(after), len(intersection)

(2208, 1648, 528)

After the renaming:

intersection = set(before.keys()) & set(after.keys())

len(before), len(after), len(intersection)

(2208, 2208, 2208)
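The totals above show how many keys overlap, but not which kinds of keys are missing. One way to see that is to count keys by their suffix. The sketch below assumes the three suffixes that appear in the renaming code (`lora_down.weight`, `lora_up.weight`, `alpha`); the helper name `count_by_kind` and the sample keys are hypothetical.

```python
from collections import Counter

# Suffixes assumed from the renaming code in this thread.
SUFFIXES = ("lora_down.weight", "lora_up.weight", "alpha")


def count_by_kind(keys):
    """Count keys per known suffix; anything else is counted as 'other'."""
    counts = Counter()
    for key in keys:
        kind = next((s for s in SUFFIXES if key.endswith(s)), "other")
        counts[kind] += 1
    return counts


# Hypothetical key names, for illustration only.
sample_keys = [
    "lora_unet_x.lora_down.weight",
    "lora_unet_x.lora_up.weight",
    "lora_te1_y.lora_down.weight",
    "lora_te1_y.lora_up.weight",
    "lora_te1_y.alpha",
]

for kind, n in sorted(count_by_kind(sample_keys).items()):
    print(kind, n)
```

Running this on both files would show, for example, whether the new file has fewer `alpha` entries than `lora_up.weight` entries.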

qwerdf4 · 2024-03-03T10:07:20Z

I also encountered the same problem

github-actions · 2024-03-27T15:03:01Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

yiyixuxu · 2024-03-27T23:02:53Z

@sayakpaul
is this the fix? #7435

sayakpaul · 2024-03-28T01:03:00Z

Yeah could be.

github-actions · 2024-04-21T15:04:22Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

yiyixuxu · 2024-04-22T21:55:39Z

assuming fixed in #7435
let us know if it is still an issue, and we will reopen this!

patryk-bartkowiak-nitid added the bug Something isn't working label Feb 7, 2024

yiyixuxu assigned sayakpaul Feb 7, 2024

yiyixuxu added the conversion script label Feb 17, 2024

github-actions bot added the stale Issues that haven't received updates label Mar 27, 2024

yiyixuxu removed the stale Issues that haven't received updates label Mar 27, 2024

github-actions bot added the stale Issues that haven't received updates label Apr 21, 2024

yiyixuxu closed this as completed Apr 22, 2024
