Add Intro page of TCD #7259
> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.

> ***Flexible NFEs:*** The NFEs for TCD sampling can be varied at will without adversely affecting the quality of the results.

> ***Freely Change the Detailing:*** During inference, the level of detail in the image can be simply modified by adjusting one hyper-parameter gamma. This option does not require the introduction of any additional parameters.
(nit): Could we maybe make these bullet points?
From the [Official Project Page](https://mhh0318.github.io/tcd/), the major merit of TCD can be outlined as follows:

> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with origin SDXL. It is worth noting that there is no additional discriminator or LPIPS supervision included during training.
The reader doesn't yet know what NFE is. Also, it's worth hyperlinking DPM-Solver++(2S).
> ***Better than Teacher:*** TCD maintains superior generative quality at both low NFEs and high NFEs, even exceeding the performance of DPM-Solver++(2S) with Stable Diffusion XL (SDXL). It is worth noting that no additional discriminator or LPIPS supervision is included during training.

For more technical details of TCD, please refer to [the paper](https://arxiv.org/abs/2402.19159).

Trajectory consistency distillation can directly place on top of a pre-trained diffusion model as a LoRA module. Such LoRA can be identified as a versatile acceleration module applicable to different fine-tuned models or LoRAs sharing the same base model without the need for additional training.
Trajectory consistency distillation can be directly placed on top of a pre-trained diffusion model as a [LoRA](https://huggingface.co/docs/diffusers/main/en/training/lora) module. Such a LoRA can be identified as a versatile acceleration module applicable to different fine-tuned models or LoRAs sharing the same base model without the need for additional training.

TCD-LoRAs are available for [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), [stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base), and [stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).

The corresponding checkpoints can be found at [TCD-SD15](https://huggingface.co/h1t/TCD-SD15-LoRA), [TCD-SD21-base](https://huggingface.co/h1t/TCD-SD21-base-LoRA) and [TCD-SDXL](https://huggingface.co/h1t/TCD-SDXL-LoRA), separately.
The corresponding checkpoints can be found at [TCD-SD15](https://huggingface.co/h1t/TCD-SD15-LoRA), [TCD-SD21-base](https://huggingface.co/h1t/TCD-SD21-base-LoRA), and [TCD-SDXL](https://huggingface.co/h1t/TCD-SDXL-LoRA), respectively.

- IP-Adapter
- AnimateDiff

TCD-LoRA can be considered an advanced method compared with [LCM-LoRA](https://latent-consistency-models.github.io/). The guide of TCD-LoRA workflow is:
TCD-LoRA can be considered an advanced method compared with [LCM-LoRA](https://huggingface.co/docs/diffusers/main/en/using-diffusers/inference_with_lcm_lora). The main parts of the TCD-LoRA workflow are as follows:

<Tip>
Eta (referred to as `gamma` in the paper) is used to control the stochasticity in every step.
A value of 0.3 often yields good results, where eta = 0 means deterministic and eta = 1 is identical to the Multi-step Consistency Sampler (as well as LCMScheduler).
We recommend using a higher eta when increasing the number of inference steps.
</Tip>
<Tip>
Eta (referred to as `gamma` in the paper) is used to control the stochasticity in every step.
A value of 0.3 often yields good results, where eta = 0 means deterministic and eta = 1 is identical to the Multi-step Consistency Sampler (as well as LCMScheduler).
We recommend using a higher eta when increasing the number of inference steps.
</Tip>
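To make the role of `eta` in the tip concrete, here is a minimal sketch of how the intermediate timestep could be chosen at each sampling step. It assumes the gamma-sampling rule described in the paper, where the model first denoises to timestep `floor((1 - eta) * s)` and noise is then added to return to the next scheduled timestep `s`. The function name `intermediate_timestep` is hypothetical and is not part of diffusers.

```python
import math


def intermediate_timestep(next_timestep: int, eta: float) -> int:
    """Timestep the sampler denoises to before re-noising up to `next_timestep`.

    Assumption: this follows the paper's rule s' = floor((1 - eta) * s).
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return math.floor((1.0 - eta) * next_timestep)


# eta = 0: land exactly on the next timestep, so no noise is re-injected.
# eta = 1: denoise all the way to timestep 0, then re-noise (fully stochastic).
for eta in (0.0, 0.5, 1.0):
    print(eta, intermediate_timestep(500, eta))
```

Under this reading, a larger `eta` means a larger gap between the denoised timestep and the next scheduled one, and therefore more fresh noise per step.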
pipe.load_lora_weights(tcd_lora_id)
pipe.fuse_lora()

prompt = "Beautiful woman, bubblegum pink, lemon yellow, minty blue, futuristic, high-detail, epic composition, watercolor."
Let's replace it with a cute cat.
Yeah works for me. Could you generate the results for those as well?
For animagine-xl-3.0 and IP-Adapter, human characters perform better. Do we need to forcefully replace these?
I will let @stevhliu comment further here.
Should be ok for this example 👍
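For context, the snippet under discussion could be completed roughly as follows. This is a sketch, not the PR's final text: the base model and LoRA IDs come from the links in the guide, while `guidance_scale=0`, four inference steps, the `fp16` variant, and the helper names `sampling_kwargs` and `build_tcd_pipeline` are assumptions made for illustration. The cat prompt follows the suggestion above.

```python
def sampling_kwargs(prompt: str, steps: int = 4, eta: float = 0.3) -> dict:
    """Collect the call arguments for few-step TCD sampling (assumed defaults)."""
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0,  # assumption: classifier-free guidance disabled
        "eta": eta,  # called gamma in the paper
    }


def build_tcd_pipeline(
    base_model_id: str = "stabilityai/stable-diffusion-xl-base-1.0",
    tcd_lora_id: str = "h1t/TCD-SDXL-LoRA",
    device: str = "cuda",
):
    """Load an SDXL pipeline, switch to TCDScheduler, and fuse the TCD-LoRA."""
    # Imported lazily so this file can be read and imported without a GPU stack.
    import torch
    from diffusers import StableDiffusionXLPipeline, TCDScheduler

    pipe = StableDiffusionXLPipeline.from_pretrained(
        base_model_id, torch_dtype=torch.float16, variant="fp16"
    ).to(device)
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(tcd_lora_id)
    pipe.fuse_lora()
    return pipe
```

A call would then look like `build_tcd_pipeline()(**sampling_kwargs("A photo of a cute cat, high detail.")).images[0]`. Because the LoRA is loaded on top of whatever checkpoint is passed as `base_model_id`, the same function could in principle be pointed at a fine-tuned SDXL model, though such a checkpoint may need different loading arguments.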
## TCD-LoRA is Versatile for Community Models

As mentioned above, the TCD-LoRA is versatile for community models and plugins. We initially demonstrate the results with a community fine-tuned base model [animagine-xl-3.0](https://huggingface.co/cagliostrolab/animagine-xl-3.0).
As mentioned above, the TCD-LoRA is versatile for community models and plugins. To test-drive this, load a community fine-tuned base model [animagine-xl-3.0](https://huggingface.co/cagliostrolab/animagine-xl-3.0).

![](https://github.com/jabir-zheng/TCD/raw/main/assets/animagine_xl.png)

Furthermore, TCD-LoRA also support other style LoRA. Here is an example with [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL). To learn more about how to combine LoRAs, refer to [this guide](https://huggingface.co/docs/diffusers/tutorials/using_peft_for_inference#combine-multiple-adapters).
Furthermore, TCD-LoRA also supports LoRAs corresponding to other styles. Below is an example with [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL). To learn more about how to combine LoRAs, refer to [this guide](https://huggingface.co/docs/diffusers/tutorials/using_peft_for_inference#combine-multiple-adapters).
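Combining the TCD-LoRA with a style LoRA could be sketched as below. The adapter names `"tcd"` and `"style"`, the equal weights, and the helper names are assumptions for illustration. `pipe` is expected to be an SDXL pipeline whose scheduler has already been switched to `TCDScheduler`, and the adapters are left unfused so that their weights can still be adjusted.

```python
def adapter_setup(tcd_weight: float = 1.0, style_weight: float = 1.0):
    """Return adapter names and weights in matching order (names are hypothetical)."""
    names = ["tcd", "style"]
    weights = [tcd_weight, style_weight]
    return names, weights


def load_tcd_with_style(
    pipe,
    tcd_lora_id: str = "h1t/TCD-SDXL-LoRA",
    styled_lora_id: str = "TheLastBen/Papercut_SDXL",
):
    """Load both LoRAs as named adapters and activate them together."""
    names, weights = adapter_setup()
    pipe.load_lora_weights(tcd_lora_id, adapter_name=names[0])
    pipe.load_lora_weights(styled_lora_id, adapter_name=names[1])
    pipe.set_adapters(names, adapter_weights=weights)
    return pipe
```

Lowering `style_weight` in `adapter_setup` would be one way to tone down the style while keeping the acceleration adapter at full strength.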
## Compatibility with ControlNet

For this example, we'll keep using the SDXL model and the TCD-LoRA for SDXL with depth and canny ControlNet.
For this example, you'll keep using the SDXL model and the TCD-LoRA for SDXL with depth and canny ControlNets.
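A rough sketch of the canny variant is shown below. The ControlNet checkpoint ID `diffusers/controlnet-canny-sdxl-1.0`, the `fp16` variant, the conditioning scale of 0.5, and both helper names are assumptions for illustration. The control image (a canny edge map) has to be prepared separately and is not covered here.

```python
def controlnet_call_kwargs(
    prompt: str,
    control_image,
    conditioning_scale: float = 0.5,
    steps: int = 4,
    eta: float = 0.3,
) -> dict:
    """Collect call arguments for TCD sampling with a ControlNet (assumed defaults)."""
    return {
        "prompt": prompt,
        "image": control_image,  # the conditioning image, e.g. a canny edge map
        "num_inference_steps": steps,
        "guidance_scale": 0,  # assumption: classifier-free guidance disabled
        "eta": eta,
        "controlnet_conditioning_scale": conditioning_scale,
    }


def build_tcd_controlnet_pipeline(
    controlnet_id: str = "diffusers/controlnet-canny-sdxl-1.0",
    base_model_id: str = "stabilityai/stable-diffusion-xl-base-1.0",
    tcd_lora_id: str = "h1t/TCD-SDXL-LoRA",
    device: str = "cuda",
):
    """Load SDXL with a ControlNet, switch to TCDScheduler, and fuse the TCD-LoRA."""
    # Imported lazily so the helpers above stay usable without a GPU stack.
    import torch
    from diffusers import (
        ControlNetModel,
        StableDiffusionXLControlNetPipeline,
        TCDScheduler,
    )

    controlnet = ControlNetModel.from_pretrained(
        controlnet_id, torch_dtype=torch.float16, variant="fp16"
    )
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        base_model_id,
        controlnet=controlnet,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to(device)
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(tcd_lora_id)
    pipe.fuse_lora()
    return pipe
```

The depth variant would presumably differ only in the ControlNet checkpoint and in how the control image is produced.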
Very useful guide. Thanks for writing it in such detail!
Let's wait for @stevhliu's reviews before applying suggestions.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Super impressive results! My main comments are about how to organize and structure the guide so that it flows better without feeling too repetitive. Great job again! 👍
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Just need to move the inpainting section and then we should be ready to merge :)
).images[0]
```

![](https://github.com/jabir-zheng/TCD/raw/main/assets/demo_image.png)
![](https://github.com/jabir-zheng/TCD/raw/main/assets/demo_image.png)
</hfoption>
<hfoption id="inpainting">
move inpainting content here
</hfoption>
</hfoptions>

![](https://github.com/jabir-zheng/TCD/raw/main/assets/styled_lora.png)

## Inpainting with TCD
Looks like this section hasn't been moved yet! Take a look at my suggestion above :)
Thanks for your great contributions! @stevhliu could you review once and merge?
LGTM, thanks again for your awesome contribution! 🤗
Thanks for your great efforts and assistance, Steven and Sayak!
What does this PR do?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul
Hi Sayak, I've updated an intro page for TCD. Please have a look.