
Add Intro page of TCD #7259

Merged · 9 commits · Mar 13, 2024
Conversation

@mhh0318 (Contributor) commented Mar 8, 2024

What does this PR do?

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul

Hi Sayak, I've added an intro page for TCD. Please take a look.

@yiyixuxu yiyixuxu requested review from sayakpaul and stevhliu March 8, 2024 22:38
Comment on lines 21 to 25
> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.

> ***Flexible NFEs:*** The NFEs for TCD sampling can be varied at will without adversely affecting the quality of the results.

> ***Freely Change the Detailing:*** During inference, the level of detail in the image can be simply modified by adjusting one hyper-parameter gamma. This option does not require the introduction of any additional parameters.
Member

(nit): Could we maybe make these bullet points?


From the [Official Project Page](https://mhh0318.github.io/tcd/), the major merit of TCD can be outlined as follows:

> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.
Member

The reader doesn't yet know what NFE is. Also, it's worth hyperlinking DPM-Solver++(2S).


From the [Official Project Page](https://mhh0318.github.io/tcd/), the major merit of TCD can be outlined as follows:

> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.
Member

Suggested change
> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.
> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with Stable Diffusion XL (SDXL). It is worth noting that no additional discriminator or LPIPS supervision is included during training.


For more technical details of TCD, please refer to [the paper](https://arxiv.org/abs/2402.19159).

Trajectory consistency distillation can directly place on top of a pre-trained diffusion model as a LoRA module. Such LoRA can be identified as a versatile acceleration module applicable to different fine-tuned models or LoRAs sharing the same base model without the need for additional training.
Member

Suggested change
Trajectory consistency distillation can directly place on top of a pre-trained diffusion model as a LoRA module. Such LoRA can be identified as a versatile acceleration module applicable to different fine-tuned models or LoRAs sharing the same base model without the need for additional training.
Trajectory consistency distillation can be directly placed on top of a pre-trained diffusion model as a [LoRA](https://huggingface.co/docs/diffusers/main/en/training/lora) module. Such a LoRA can be identified as a versatile acceleration module applicable to different fine-tuned models or LoRAs sharing the same base model without the need for additional training.


TCD-LoRAs are available for [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), [stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base), and [stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).

The corresponding checkpoints can be found at [TCD-SD15](https://huggingface.co/h1t/TCD-SD15-LoRA), [TCD-SD21-base](https://huggingface.co/h1t/TCD-SD21-base-LoRA) and [TCD-SDXL](https://huggingface.co/h1t/TCD-SDXL-LoRA), separately.
Member

Suggested change
The corresponding checkpoints can be found at [TCD-SD15](https://huggingface.co/h1t/TCD-SD15-LoRA), [TCD-SD21-base](https://huggingface.co/h1t/TCD-SD21-base-LoRA) and [TCD-SDXL](https://huggingface.co/h1t/TCD-SDXL-LoRA), separately.
The corresponding checkpoints can be found at [TCD-SD15](https://huggingface.co/h1t/TCD-SD15-LoRA), [TCD-SD21-base](https://huggingface.co/h1t/TCD-SD21-base-LoRA), and [TCD-SDXL](https://huggingface.co/h1t/TCD-SDXL-LoRA), respectively.
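For illustration, here is a minimal sketch of loading one of these TCD-LoRAs onto its SDXL base model with the TCDScheduler; the scheduler swap, dtype, and variable names follow the usual diffusers LoRA workflow and are assumptions here rather than code taken from the guide:

```python
import torch
from diffusers import StableDiffusionXLPipeline, TCDScheduler

# Assumed IDs: the SDXL base model and the matching TCD-LoRA checkpoint listed above.
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
tcd_lora_id = "h1t/TCD-SDXL-LoRA"

pipe = StableDiffusionXLPipeline.from_pretrained(
    base_model_id, torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# Replace the default scheduler with TCDScheduler so the distilled trajectory is used.
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# Load and fuse the TCD-LoRA acceleration module.
pipe.load_lora_weights(tcd_lora_id)
pipe.fuse_lora()
```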

- IP-Adapter
- AnimateDiff

TCD-LoRA can be considered an advanced method compared with [LCM-LoRA](https://latent-consistency-models.github.io/). The guide of TCD-LoRA workflow is:
@sayakpaul (Member) commented Mar 9, 2024

Suggested change
TCD-LoRA can be considered an advanced method compared with [LCM-LoRA](https://latent-consistency-models.github.io/). The guide of TCD-LoRA workflow is:
TCD-LoRA can be considered an advanced method compared with [LCM-LoRA](https://huggingface.co/docs/diffusers/main/en/using-diffusers/inference_with_lcm_lora). The main parts of the TCD-LoRA workflow are as follows:

Comment on lines 93 to 97
<Tip>
Eta (referred to as `gamma` in the paper) is used to control the stochasticity in every step.
A value of 0.3 often yields good results, where eta = 0 means deterministic and eta = 1 is identical to the multistep consistency sampler (as well as LCMScheduler).
We recommend using a higher eta when increasing the number of inference steps.
</Tip>
Member

Suggested change
<Tip>
Eta (referred to as `gamma` in the paper) is used to control the stochasticity in every step.
A value of 0.3 often yields good results, where eta = 0 means deterministic and eta = 1 is identical to the multistep consistency sampler (as well as LCMScheduler).
We recommend using a higher eta when increasing the number of inference steps.
</Tip>
<Tip>
Eta (referred to as `gamma` in the paper) is used to control the stochasticity in every step.
A value of 0.3 often yields good results, where eta = 0 means deterministic and eta = 1 is identical to the multistep consistency sampler (as well as LCMScheduler).
We recommend using a higher eta when increasing the number of inference steps.
</Tip>
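To make the role of `eta` concrete, here is a hedged sketch of a generation call continuing from a pipeline set up as in the earlier sketch; the prompt, step count, seed, and `guidance_scale=0` are illustrative assumptions, not values taken from the guide:

```python
import torch

# Assumes `pipe` is an SDXL pipeline with TCDScheduler and a TCD-LoRA already loaded.
prompt = "a cute cat sitting on a windowsill, watercolor"  # placeholder prompt

image = pipe(
    prompt=prompt,
    num_inference_steps=4,   # few-step sampling enabled by the distilled LoRA
    guidance_scale=0,        # assumption: classifier-free guidance is usually disabled here
    eta=0.3,                 # the paper's gamma: 0 = deterministic, 1 = multistep consistency sampling
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]
image.save("tcd_sample.png")
```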

pipe.load_lora_weights(tcd_lora_id)
pipe.fuse_lora()

prompt = "Beautiful woman, bubblegum pink, lemon yellow, minty blue, futuristic, high-detail, epic composition, watercolor."
Member

Could we try to make use of non-human characters? Since it's the official documentation, I am slightly concerned about the reception of this.

WDYT? @yiyixuxu @stevhliu

Contributor Author

Let's replace it with a cute cat.

Member

Yeah works for me. Could you generate the results for those as well?

Contributor Author

For animagine-xl-3.0 and IP-Adapter, the human characters perform better. Do we still need to replace them?

Member

I will let @stevhliu comment further here.

Member

Should be ok for this example 👍


## TCD-LoRA is Versatile for Community Models

As mentioned above, the TCD-LoRA is versatile for community models and plugins. We initially demonstrate the results with a community fine-tuned base model [animagine-xl-3.0](https://huggingface.co/cagliostrolab/animagine-xl-3.0).
Member

Suggested change
As mentioned above, the TCD-LoRA is versatile for community models and plugins. We initially demonstrate the results with a community fine-tuned base model [animagine-xl-3.0](https://huggingface.co/cagliostrolab/animagine-xl-3.0).
As mentioned above, the TCD-LoRA is versatile for community models and plugins. To test-drive this, load a community fine-tuned base model [animagine-xl-3.0](https://huggingface.co/cagliostrolab/animagine-xl-3.0).
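A rough sketch of what that test drive might look like, reusing the TCD-SDXL LoRA since animagine-xl-3.0 shares the SDXL base; the prompt and sampling parameters are illustrative assumptions:

```python
import torch
from diffusers import StableDiffusionXLPipeline, TCDScheduler

# animagine-xl-3.0 is an SDXL fine-tune, so the SDXL TCD-LoRA applies without retraining.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-3.0", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("h1t/TCD-SDXL-LoRA")
pipe.fuse_lora()

image = pipe(
    prompt="a fox wearing a scarf, anime style, highly detailed",  # placeholder prompt
    num_inference_steps=8,
    guidance_scale=0,
    eta=0.3,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]
```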


![](https://github.com/jabir-zheng/TCD/raw/main/assets/animagine_xl.png)

Furthermore, TCD-LoRA also support other style LoRA. Here is an example with [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL). To learn more about how to combine LoRAs, refer to [this guide](https://huggingface.co/docs/diffusers/tutorials/using_peft_for_inference#combine-multiple-adapters).
Member

Suggested change
Furthermore, TCD-LoRA also support other style LoRA. Here is an example with [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL). To learn more about how to combine LoRAs, refer to [this guide](https://huggingface.co/docs/diffusers/tutorials/using_peft_for_inference#combine-multiple-adapters).
Furthermore, TCD-LoRA also supports LoRAs corresponding to other styles. Below is an example with [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL). To learn more about how to combine LoRAs, refer to [this guide](https://huggingface.co/docs/diffusers/tutorials/using_peft_for_inference#combine-multiple-adapters).
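Below is a hedged sketch of combining the TCD-LoRA with the Papercut style LoRA through the PEFT integration; the `weight_name`, adapter names, adapter weights, and prompt are assumptions for illustration:

```python
import torch
from diffusers import StableDiffusionXLPipeline, TCDScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# Load the acceleration LoRA and the style LoRA under separate adapter names.
pipe.load_lora_weights("h1t/TCD-SDXL-LoRA", adapter_name="tcd")
pipe.load_lora_weights(
    "TheLastBen/Papercut_SDXL",
    weight_name="papercut.safetensors",  # filename is an assumption
    adapter_name="papercut",
)

# Activate both adapters; per-adapter weights can be tuned to balance speed-up and style.
pipe.set_adapters(["tcd", "papercut"], adapter_weights=[1.0, 1.0])

image = pipe(
    prompt="papercut, a cute fox in a forest",  # placeholder prompt
    num_inference_steps=4,
    guidance_scale=0,
    eta=0.3,
).images[0]
```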


## Compatibility with ControlNet

For this example, we'll keep using the SDXL model and the TCD-LoRA for SDXL with depth and canny ControlNet.
Member

Suggested change
For this example, we'll keep using the SDXL model and the TCD-LoRA for SDXL with depth and canny ControlNet.
For this example, you'll keep using the SDXL model and the TCD-LoRA for SDXL with depth and canny ControlNets.
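As a hedged sketch of the canny case (the depth case is analogous with a depth ControlNet and a depth map), assuming the community `diffusers/controlnet-canny-sdxl-1.0` checkpoint and a documentation image as conditioning input; none of these IDs, URLs, or parameter values come from the PR itself, and the edge-map step needs `opencv-python`:

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, TCDScheduler
from diffusers.utils import load_image

# Assumed checkpoints: a canny ControlNet for SDXL plus the SDXL base and TCD-LoRA.
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("h1t/TCD-SDXL-LoRA")
pipe.fuse_lora()

# Build a canny edge map from an example image to use as conditioning (URL is an assumption).
source = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
edges = cv2.Canny(np.array(source), 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

image = pipe(
    prompt="a portrait, papercut style, best quality",  # placeholder prompt
    image=canny_image,
    num_inference_steps=4,
    guidance_scale=0,
    eta=0.3,
    controlnet_conditioning_scale=0.5,
).images[0]
```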

@sayakpaul (Member) left a comment

Very useful guide. Thanks for writing it in such detail!

Let's wait for @stevhliu's reviews before applying suggestions.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu (Member) left a comment

Super impressive results! My main comments are about how to organize and structure the guide so that it flows better without feeling too repetitive. Great job again! 👍

docs/source/en/using-diffusers/inference_with_tcd_lora.md (9 review comments, resolved)
mhh0318 and others added 3 commits March 12, 2024 11:19
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@stevhliu (Member) left a comment

Just need to move the inpainting section and then we should be ready to merge :)

```
).images[0]
```

![](https://github.com/jabir-zheng/TCD/raw/main/assets/demo_image.png)
Member

Suggested change
![](https://github.com/jabir-zheng/TCD/raw/main/assets/demo_image.png)
![](https://github.com/jabir-zheng/TCD/raw/main/assets/demo_image.png)
</hfoption>
<hfoption id="inpainting">
move inpainting content here
</hfoption>
</hfoptions>
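For reference, here is a rough sketch of what the inpainting snippet being relocated might contain; the inpainting checkpoint, image URLs, prompt, and parameter values are assumptions for illustration, not the PR's actual content:

```python
import torch
from diffusers import AutoPipelineForInpainting, TCDScheduler
from diffusers.utils import load_image

# Assumed SDXL inpainting checkpoint combined with the SDXL TCD-LoRA.
pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("h1t/TCD-SDXL-LoRA")
pipe.fuse_lora()

# Example image and mask commonly used in inpainting demos (URLs are assumptions).
init_image = load_image(
    "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
).resize((1024, 1024))
mask_image = load_image(
    "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
).resize((1024, 1024))

image = pipe(
    prompt="a tiger sitting on a park bench",  # placeholder prompt
    image=init_image,
    mask_image=mask_image,
    num_inference_steps=8,
    guidance_scale=0,
    eta=0.3,
    strength=0.99,  # how much of the masked region gets re-noised
).images[0]
```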

![](https://github.com/jabir-zheng/TCD/raw/main/assets/styled_lora.png)


## Inpainting with TCD
Member

Looks like this section hasn't been moved yet! Take a look at my suggestion above :)

@sayakpaul (Member) left a comment

Thanks for your great contributions! @stevhliu could you review once and merge?

@stevhliu (Member) left a comment

LGTM, thanks again for your awesome contribution! 🤗

@stevhliu stevhliu merged commit b300517 into huggingface:main Mar 13, 2024
1 check passed
@mhh0318 (Contributor Author) commented Mar 13, 2024

Thanks for your great efforts and assistance, Steven and Sayak!
