FLUX with diffusers 0.31.0 #1450

Merged 6 commits on Nov 27, 2024
123 changes: 117 additions & 6 deletions examples/stable-diffusion/README.md
@@ -28,12 +28,12 @@ First, you should install the requirements:
pip install -r requirements.txt
```


## Text-to-image Generation

### Single Prompt

Here is how to generate images with one prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -51,10 +51,10 @@ python text_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -69,7 +69,9 @@ python text_to_image_generation.py \
```

### Distributed inference with multiple HPUs

Here is how to generate images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -109,10 +111,10 @@ python text_to_image_generation.py \
```

> There are two different checkpoints for Stable Diffusion 2:
>
> - use [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) for generating 768x768 images
> - use [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) for generating 512x512 images
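For example, here is a minimal sketch of a 512x512 generation with the base checkpoint (the `--height` and `--width` options and the sample prompt are assumptions here; check `python text_to_image_generation.py --help` for the exact flags):

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-1-base \
--prompts "An image of a squirrel in Picasso style" \
--num_images_per_prompt 4 \
--batch_size 2 \
--height 512 \
--width 512 \
--image_save_dir /tmp/stable_diffusion_images \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16
```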


### Latent Diffusion Model for 3D (LDM3D)

[LDM3D](https://arxiv.org/abs/2305.10853) generates both image and depth map data from a given text prompt, allowing users to generate RGBD images from text prompts.
@@ -135,7 +137,9 @@ python text_to_image_generation.py \
--ldm3d \
--bf16
```

Here is how to generate images and depth maps with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -154,6 +158,7 @@ python ../gaudi_spawn.py \
```

> There are three different checkpoints for LDM3D:
>
> - use [original checkpoint](https://huggingface.co/Intel/ldm3d) to generate outputs from the paper
> - use [the latest checkpoint](https://huggingface.co/Intel/ldm3d-4c) for generating improved results
> - use [the pano checkpoint](https://huggingface.co/Intel/ldm3d-pano) to generate panoramic view
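For instance, to generate improved results you can point the single-HPU LDM3D command above at the latest checkpoint; this is a sketch where only the model path changes (the save directory and prompt are arbitrary placeholders):

```bash
python text_to_image_generation.py \
--model_name_or_path Intel/ldm3d-4c \
--prompts "An image of a squirrel in Picasso style" \
--num_images_per_prompt 10 \
--batch_size 2 \
--image_save_dir /tmp/ldm3d_images \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--ldm3d \
--bf16
```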
@@ -163,6 +168,7 @@ python ../gaudi_spawn.py \
Stable Diffusion XL was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://arxiv.org/pdf/2307.01952.pdf) by the Stability AI team.

Here is how to generate SDXL images with a single prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -182,6 +188,7 @@ python text_to_image_generation.py \
> You can enable this mode with `--use_hpu_graphs`.

Here is how to generate SDXL images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -199,6 +206,7 @@ python text_to_image_generation.py \
SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly
increase the number of parameters. Here is how to generate images with several prompts for both `prompt`
and `prompt_2` (2nd text encoder), as well as their negative prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -217,6 +225,7 @@ python text_to_image_generation.py \
```

Here is how to generate SDXL images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -235,14 +244,17 @@ python ../gaudi_spawn.py \
--bf16 \
--distributed
```

> HPU graphs are recommended when generating images by batches to get the fastest possible generations.
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.

### SDXL-Turbo

SDXL-Turbo is a distilled version of SDXL 1.0, trained for real-time synthesis.

Here is how to generate images with multiple prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/sdxl-turbo \
@@ -275,11 +287,13 @@ Before running the SD3 pipeline, you need to:

1. Agree to the Terms and Conditions for using the SD3 model on its [HuggingFace model page](https://huggingface.co/stabilityai/stable-diffusion-3-medium)
2. Authenticate with HuggingFace using your HF token. For authentication, run:

```bash
huggingface-cli login
```
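If you prefer a non-interactive login (e.g., in scripts or CI), `huggingface-cli` also accepts the token directly; this sketch assumes your token is exported as `HF_TOKEN`:

```bash
huggingface-cli login --token $HF_TOKEN
```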

Here is how to generate SD3 images with a single prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python text_to_image_generation.py \
@@ -299,12 +313,100 @@ python text_to_image_generation.py \
> For improved performance of the SD3 pipeline on Gaudi, it is recommended to configure the environment
> by setting `PT_HPU_MAX_COMPOUND_OP_SIZE` to 1.
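For example, instead of prefixing each command, you can set the variable once per shell session:

```bash
export PT_HPU_MAX_COMPOUND_OP_SIZE=1
```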

### FLUX.1

FLUX.1 was introduced by Black Forest Labs [here](https://blackforestlabs.ai/announcing-black-forest-labs/).

Here is how to run the FLUX.1-schnell model (the fast version of FLUX.1):

```bash
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-schnell \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 4 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16
```

Before running the FLUX.1-dev model, you need to:

1. Agree to the Terms and Conditions for using the FLUX.1-dev model on its [HuggingFace model page](https://huggingface.co/black-forest-labs/FLUX.1-dev)
2. Authenticate with HuggingFace using your HF token. For authentication, run:

```bash
huggingface-cli login
```

Here is how to run the FLUX.1-dev model:

```bash
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-dev \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 30 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16
```

This model can also be quantized, with some ops running in FP8 precision.

Before quantization, run stats collection using measure mode:

```bash
QUANT_CONFIG=quantization/flux/measure_config.json \
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-dev \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 30 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16 \
--quant_mode measure
```
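Once the measurement run finishes, the collected calibration stats should appear under the `dump_stats_path` configured in `quantization/flux/measure_config.json` (shown later in this diff); a quick way to verify:

```bash
ls quantization/flux/measure_all/
```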

After stats collection, here is how to run FLUX.1-dev in quantization mode:

```bash
QUANT_CONFIG=quantization/flux/quantize_config.json \
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-dev \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 30 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16 \
--quant_mode quantize
```

## ControlNet

ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
It is a type of model for controlling Stable Diffusion by conditioning it on an additional input image.

Here is how to generate images conditioned by the canny edge model:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -321,6 +423,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model with multiple prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -337,6 +440,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -355,6 +459,7 @@ python ../gaudi_spawn.py \
```

Here is how to generate images conditioned by the open pose model:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -372,6 +477,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by the canny edge model using Stable Diffusion 2:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-1 \
@@ -395,6 +501,7 @@ Inpainting replaces or edits specific areas of an image. For more details,
please refer to [Hugging Face Diffusers doc](https://huggingface.co/docs/diffusers/en/using-diffusers/inpaint).

### Stable Diffusion Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-inpainting \
@@ -412,6 +519,7 @@ python text_to_image_generation.py \
```

### Stable Diffusion XL Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path diffusers/stable-diffusion-xl-1.0-inpainting-0.1 \
@@ -457,10 +565,10 @@ python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts and one image:

```bash
python image_to_image_generation.py \
--model_name_or_path "timbrooks/instruct-pix2pix" \
@@ -482,10 +590,10 @@ python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Stable Diffusion XL Refiner

Here is how to generate SDXL images with a single prompt and one image:

```bash
python image_to_image_generation.py \
--model_name_or_path "stabilityai/stable-diffusion-xl-refiner-1.0" \
@@ -505,6 +613,7 @@ python image_to_image_generation.py \
### Stable Diffusion Image Variations

Here is how to generate image variations from one input image (this pipeline does not accept text prompts):

```bash
python image_to_image_generation.py \
--model_name_or_path "lambdalabs/sd-image-variations-diffusers" \
@@ -625,6 +734,7 @@ Script `image_to_video_generation.py` showcases how to perform image-to-video generation.
### Single Image Prompt

Here is how to generate a video with one image prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
@@ -645,6 +755,7 @@ python image_to_video_generation.py \
### Multiple Image Prompts

Here is how to generate videos with several image prompts:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
5 changes: 5 additions & 0 deletions examples/stable-diffusion/quantization/flux/measure_config.json
@@ -0,0 +1,5 @@
{
"method": "HOOKS",
"mode": "MEASURE",
"dump_stats_path": "quantization/flux/measure_all/fp8"
}
6 changes: 6 additions & 0 deletions examples/stable-diffusion/quantization/flux/quantize_config.json
@@ -0,0 +1,6 @@
{
"method": "HOOKS",
"mode": "QUANTIZE",
"scale_method": "maxabs_hw_opt_weight",
"dump_stats_path": "quantization/flux/measure_all/fp8"
}
3 changes: 2 additions & 1 deletion examples/stable-diffusion/requirements.txt
@@ -1,2 +1,3 @@
opencv-python
compel
sentencepiece