
[docs] Merge LoRAs #7213

Merged
merged 4 commits into from
Mar 7, 2024
Conversation

stevhliu
Member

@stevhliu stevhliu commented Mar 4, 2024

Continuation of #7110 to implement option 3 for combining all merge methods into one doc and cleaning up the other docs to focus on loading or basic usage.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@@ -165,77 +147,6 @@ list_adapters_component_wise
```
{"text_encoder": ["toy", "pixel"], "unet": ["toy", "pixel"], "text_encoder_2": ["toy", "pixel"]}
```

## Compatibility with `torch.compile`
Member

Why is this going away? 😱

Member Author

Oops sorry I forgot about this section! Added it back in as a nested section under the fuse_lora section 😅

Member

I am a little uncomfortable about how it's ported in the new section.

Left a comment here: https://github.com/huggingface/diffusers/pull/7213/files#r1513791539.

Member

@sayakpaul sayakpaul left a comment

This looks so NEAT!

@pacman100 for awareness.

Comment on lines 235 to 249
### torch.compile

[torch.compile](../optimization/torch2.0#torchcompile) can speed up your pipeline even more, but you have to make sure the LoRA weights are fused first and then unloaded.

```py
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "username/fused-ikea-feng", torch_dtype=torch.float16,
).to("cuda")

pipeline.unet.to(memory_format=torch.channels_last)
pipeline.unet = torch.compile(pipeline.unet, mode="reduce-overhead", fullgraph=True)

image = pipeline("A bowl of ramen shaped like a cute kawaii bear, by Feng Zikai", generator=torch.manual_seed(0)).images[0]
image
```
Member

I think this section is quite incomplete.

  1. It doesn't load any LoRA.
  2. It doesn't apply fusion.
  3. It doesn't display the generated image.

Could we please ensure the section is ported as closely as possible? torch.compile() compatibility with PEFT is super important for our users.

Member Author

@stevhliu stevhliu Mar 6, 2024

see changes here

About points 1-2:

Since the torch.compile section is already nested under the fuse_lora section, it is assumed that this is a subtopic of fuse_lora, meaning you should read that section first. Adding the same code from the section above feels redundant. Do you think it'd be more helpful/clear if I added something saying "Make sure you read the fuse_lora section above first"?

About point 3:

I think it'd be more impactful to show the speed up in time rather than another image. wdyt?

Member

I feel relatively strongly about points 1-2. Even if a single step is missed, torch.compile() can fail during inference.

In the fuse_lora() section, the code shows pipeline.unfuse_lora(). This will make torch.compile() fail. The right sequence of steps would be:

```py
# provided `set_adapters()` was already called
pipe.fuse_lora()
pipe.unload_lora_weights()

pipe.unet.to(memory_format=torch.channels_last)
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

So, I would appreciate it if the example code snippet for torch.compile() when doing LoRA fusion was as self-contained and as complete as possible.

For pt. 3, a timings report would be misleading since torch.compile() needs warmup runs first; without them, it would give an inaccurate picture of the speedup.
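The call-order constraint described above can be sketched with a small stand-in class (a hypothetical `MockPipeline`, not the actual diffusers API): compiling is only safe once the LoRA weights have been fused into the base weights and the adapter layers unloaded, which is why ending on `unfuse_lora()` breaks things.

```python
class MockPipeline:
    """Toy stand-in for a diffusers pipeline, used only to illustrate
    the required call order: fuse_lora() -> unload_lora_weights() -> compile."""

    def __init__(self):
        self.fused = False           # LoRA weights merged into base weights?
        self.adapters_loaded = True  # adapter layers still attached?
        self.compiled = False

    def fuse_lora(self):
        self.fused = True

    def unfuse_lora(self):
        self.fused = False

    def unload_lora_weights(self):
        self.adapters_loaded = False

    def compile_unet(self):
        # Compiling while adapter layers are still attached (or after
        # unfusing) captures a graph that won't match the fused weights.
        if not self.fused or self.adapters_loaded:
            raise RuntimeError("fuse LoRA and unload adapter weights before compiling")
        self.compiled = True


pipe = MockPipeline()
pipe.fuse_lora()            # merge LoRA deltas into the base weights
pipe.unload_lora_weights()  # drop the now-redundant adapter layers
pipe.compile_unet()         # safe: the graph being traced is final
```

In the real pipeline the last call would be `torch.compile(pipe.unet, ...)`; the mock only encodes the ordering contract, not any actual compilation.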

Member Author

Sounds good! I went with your suggestion to include an entirely self-contained code snippet 🙂

Member

Thank you Steven! Appreciate it.

Member

@sayakpaul sayakpaul left a comment

Let's fix the torch.compile() section and then we can ship!

@sayakpaul
Member

Feel free to merge it. Looking solid and thanks a mile for beating it with me.

@stevhliu stevhliu merged commit 3ce905c into huggingface:main Mar 7, 2024
1 check passed
@stevhliu stevhliu deleted the merge-loras branch March 7, 2024 19:28