From 118b5e43da4a1d2c97c4342f3c21c9c4cdc11df7 Mon Sep 17 00:00:00 2001 From: apolinario Date: Thu, 20 Oct 2022 10:07:58 +0200 Subject: [PATCH 01/12] Update README.md Additionally add FLAX so the model card can be slimmer and point to this page --- README.md | 113 ++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 96 insertions(+), 17 deletions(-) diff --git a/README.md b/README.md index fe006f41c2c6..c0bc1910c364 100644 --- a/README.md +++ b/README.md @@ -64,44 +64,54 @@ In order to get started, we recommend taking a look at two notebooks: - The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffusion models training methods. This notebook takes a step-by-step approach to training your diffusion models on an image dataset, with explanatory graphics. -## **New** Stable Diffusion is now fully compatible with `diffusers`! +## Stable Diffusion is fully compatible with `diffusers`! -Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. +Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information. -You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation. +You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation. 
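If you prefer to authenticate from Python (for example, inside a notebook) rather than with the CLI, a minimal sketch using `huggingface_hub` could look like the following; it assumes a recent `huggingface_hub` release that exposes `login`, and the token string is a placeholder for your own access token:

```python
# Programmatic login sketch (assumption: a recent `huggingface_hub` release providing `login`).
from huggingface_hub import login

# The token below is a placeholder; create your own at https://huggingface.co/settings/tokens
login(token="<your-access-token>")
```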
 ### Text-to-Image generation with Stable Diffusion

+First, let's install the required libraries:
+```bash
+pip install --upgrade diffusers transformers scipy
+```
+
+Run this command to log in with your HF Hub token if you haven't done so before (you can skip this step if you prefer to run the model locally; follow [these instructions](#running-the-model-locally) instead):
+```bash
+huggingface-cli login
+```
+
 We recommend using the model in [half-precision (`fp16`)](https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/) as it almost always gives the same results as full precision while being roughly twice as fast and requiring half the amount of GPU RAM.

 ```python
-# make sure you're logged in with `huggingface-cli login`
+import torch
 from diffusers import StableDiffusionPipeline

-pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_type=torch.float16, revision="fp16")
+pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, revision="fp16")
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
 image = pipe(prompt).images[0]
 ```

-**Note**: If you don't want to use the token, you can also simply download the model weights
-(after having [accepted the license](https://huggingface.co/CompVis/stable-diffusion-v1-4)) and pass
+#### Running the model locally
+If you don't want to log in to Hugging Face, you can also simply download the model folder
+(after having [accepted the license](https://huggingface.co/runwayml/stable-diffusion-v1-5)) and pass
 the path to the local folder to the `StableDiffusionPipeline`.

 ```
 git lfs install
-git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
+git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
 ```

-Assuming the folder is stored locally under `./stable-diffusion-v1-4`, you can also run stable diffusion
+Assuming the folder is stored locally under `./stable-diffusion-v1-5`, you can also run stable diffusion
 without requiring an authentication token:

 ```python
-pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
+pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
@@ -114,7 +124,7 @@ The following snippet should result in less than 4GB VRAM.

 ```python
 pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    "runwayml/stable-diffusion-v1-5",
     revision="fp16",
     torch_dtype=torch.float16,
 )
@@ -125,7 +135,7 @@ pipe.enable_attention_slicing()
 image = pipe(prompt).images[0]
 ```

-If you wish to use a different scheduler, you can simply instantiate
+If you wish to use a different scheduler (e.g. DDIM, LMS, PNDM/PLMS), you can instantiate
 it before the pipeline and pass it to `from_pretrained`.
 ```python
@@ -138,7 +148,7 @@ lms = LMSDiscreteScheduler(
 )

 pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    "runwayml/stable-diffusion-v1-5",
     revision="fp16",
     torch_dtype=torch.float16,
     scheduler=lms,
@@ -158,7 +168,7 @@ please run the model in the default *full-precision* setting:
 # make sure you're logged in with `huggingface-cli login`
 from diffusers import StableDiffusionPipeline

-pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
+pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

 # disable the following line if you run on CPU
 pipe = pipe.to("cuda")

@@ -169,6 +179,75 @@ image = pipe(prompt).images[0]

 image.save("astronaut_rides_horse.png")
 ```

+### JAX/Flax
+
+To use Stable Diffusion on TPUs and GPUs for faster inference, you can leverage JAX/Flax.
+
+Running the pipeline with the default PNDMScheduler:
+
+```python
+import jax
+import numpy as np
+from flax.jax_utils import replicate
+from flax.training.common_utils import shard
+
+from diffusers import FlaxStableDiffusionPipeline
+
+pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", revision="flax", dtype=jax.numpy.bfloat16
+)
+
+prompt = "a photo of an astronaut riding a horse on mars"
+
+prng_seed = jax.random.PRNGKey(0)
+num_inference_steps = 50
+
+num_samples = jax.device_count()
+prompt = num_samples * [prompt]
+prompt_ids = pipeline.prepare_inputs(prompt)
+
+# shard inputs and rng (one RNG key per device)
+params = replicate(params)
+prng_seed = jax.random.split(prng_seed, num_samples)
+prompt_ids = shard(prompt_ids)
+
+images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
+images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
+```
+
+**Note**:
+If you are limited by TPU memory, please make sure to load the `FlaxStableDiffusionPipeline` with weights stored in `bfloat16` precision instead of the default `float32` precision. You can do so by telling diffusers to load the weights from the "bf16" branch:
+
+```python
+import jax
+import numpy as np
+from flax.jax_utils import replicate
+from flax.training.common_utils import shard
+
+from diffusers import FlaxStableDiffusionPipeline
+
+pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jax.numpy.bfloat16
+)
+
+prompt = "a photo of an astronaut riding a horse on mars"
+
+prng_seed = jax.random.PRNGKey(0)
+num_inference_steps = 50
+
+num_samples = jax.device_count()
+prompt = num_samples * [prompt]
+prompt_ids = pipeline.prepare_inputs(prompt)
+
+# shard inputs and rng (one RNG key per device)
+params = replicate(params)
+prng_seed = jax.random.split(prng_seed, num_samples)
+prompt_ids = shard(prompt_ids)
+
+images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
+images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
+```
+
 ### Image-to-Image text-guided generation with Stable Diffusion

 The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.
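A minimal end-to-end sketch of a typical call is shown below; the input image URL, the prompt, and the `init_image`/`strength` argument names are illustrative and reflect the pipeline's API at the time of this change rather than a fixed interface:

```python
# Illustrative img2img sketch: download a starting image, then condition generation on it.
# Assumes the fp16 weights of runwayml/stable-diffusion-v1-5 and the `init_image`/`strength`
# arguments accepted by StableDiffusionImg2ImgPipeline in this release.
from io import BytesIO

import requests
import torch
from PIL import Image

from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to(device)

# any RGB image can serve as the starting point; this URL is just an example
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB")
init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"
image = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("fantasy_landscape.png")
```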
@@ -183,14 +262,14 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" -model_id_or_path = "CompVis/stable-diffusion-v1-4" +model_id_or_path = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id_or_path, revision="fp16", torch_dtype=torch.float16, ) -# or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 -# and pass `model_id_or_path="./stable-diffusion-v1-4"`. +# or download via git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 +# and pass `model_id_or_path="./stable-diffusion-v1-5"`. pipe = pipe.to(device) # let's download an initial image From 7778c2b088e9e0f388aabba3e1e7e1ef2b0a2700 Mon Sep 17 00:00:00 2001 From: apolinario Date: Thu, 20 Oct 2022 10:13:08 +0200 Subject: [PATCH 02/12] Find and replace all --- docs/source/api/pipelines/overview.mdx | 8 ++--- docs/source/optimization/fp16.mdx | 10 +++--- docs/source/optimization/mps.mdx | 2 +- docs/source/optimization/onnx.mdx | 2 +- docs/source/quicktour.mdx | 10 +++--- docs/source/training/text_inversion.mdx | 4 +-- .../using-diffusers/custom_pipelines.mdx | 2 +- docs/source/using-diffusers/img2img.mdx | 2 +- examples/community/README.md | 10 +++--- .../community/interpolate_stable_diffusion.py | 2 +- examples/community/lpw_stable_diffusion.py | 2 +- examples/community/stable_diffusion_mega.py | 2 +- examples/dreambooth/README.md | 12 +++---- examples/test_examples.py | 2 +- examples/text_to_image/README.md | 6 ++-- examples/textual_inversion/README.md | 4 +-- src/diffusers/modeling_flax_utils.py | 14 ++++---- src/diffusers/pipeline_flax_utils.py | 6 ++-- src/diffusers/pipeline_utils.py | 6 ++-- src/diffusers/pipelines/README.md | 8 ++--- .../pipelines/stable_diffusion/README.md | 12 +++---- .../pipeline_flax_stable_diffusion.py | 2 +- .../pipeline_onnx_stable_diffusion_img2img.py | 2 +- .../pipeline_onnx_stable_diffusion_inpaint.py | 2 +- .../pipeline_stable_diffusion.py | 2 +- .../pipeline_stable_diffusion_img2img.py | 2 +- .../pipeline_stable_diffusion_inpaint.py | 2 +- ...ipeline_stable_diffusion_inpaint_legacy.py | 2 +- tests/test_pipelines.py | 32 +++++++++---------- tests/test_pipelines_flax.py | 8 ++--- 30 files changed, 90 insertions(+), 90 deletions(-) diff --git a/docs/source/api/pipelines/overview.mdx b/docs/source/api/pipelines/overview.mdx index 3082f99665f2..4fadd585ecd4 100644 --- a/docs/source/api/pipelines/overview.mdx +++ b/docs/source/api/pipelines/overview.mdx @@ -67,8 +67,8 @@ Diffusion models often consist of multiple independently-trained models or other Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one. During inference, we however want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality: -- [`from_pretrained` method](../diffusion_pipeline) that accepts a Hugging Face Hub repository id, *e.g.* [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) or a path to a local directory, *e.g.* -"./stable-diffusion". 
To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [CompVis/stable-diffusion-v1-4/model_index.json](https://huggingface.co/CompVis/stable-diffusion-v1-4/blob/main/model_index.json), which defines all components that should be +- [`from_pretrained` method](../diffusion_pipeline) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v-1-5](https://huggingface.co/runwayml/stable-diffusion-v-1-5) or a path to a local directory, *e.g.* +"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v-1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v-1-5/blob/main/model_index.json), which defines all components that should be loaded into the pipelines. More specifically, for each model/component one needs to define the format `: ["", ""]`. `` is the attribute name given to the loaded instance of `` which can be found in the library or pipeline folder called `""`. - [`save_pretrained`](../diffusion_pipeline) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`. In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated @@ -100,7 +100,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler -pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -123,7 +123,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 ).to(device) # let's download an initial image diff --git a/docs/source/optimization/fp16.mdx b/docs/source/optimization/fp16.mdx index b561aedbfe4a..c7332fa37315 100644 --- a/docs/source/optimization/fp16.mdx +++ b/docs/source/optimization/fp16.mdx @@ -56,7 +56,7 @@ If you use a CUDA GPU, you can take advantage of `torch.autocast` to perform inf from torch import autocast from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -72,7 +72,7 @@ To save more GPU memory and get even more speed, you can load and run the model ```Python pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16, ) @@ -97,7 +97,7 @@ import torch from diffusers import StableDiffusionPipeline pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="fp16", 
torch_dtype=torch.float16, ) @@ -152,7 +152,7 @@ def generate_inputs(): pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16, ).to("cuda") @@ -216,7 +216,7 @@ class UNet2DConditionOutput: pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16, ).to("cuda") diff --git a/docs/source/optimization/mps.mdx b/docs/source/optimization/mps.mdx index ff9d614c870f..21be746b69cc 100644 --- a/docs/source/optimization/mps.mdx +++ b/docs/source/optimization/mps.mdx @@ -31,7 +31,7 @@ We recommend to "prime" the pipeline using an additional one-time pass through i # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") pipe = pipe.to("mps") prompt = "a photo of an astronaut riding a horse on mars" diff --git a/docs/source/optimization/onnx.mdx b/docs/source/optimization/onnx.mdx index 9bbc4f2077c2..c5a55141bc9c 100644 --- a/docs/source/optimization/onnx.mdx +++ b/docs/source/optimization/onnx.mdx @@ -28,7 +28,7 @@ The snippet below demonstrates how to use the ONNX runtime. You need to use `Sta from diffusers import StableDiffusionOnnxPipeline pipe = StableDiffusionOnnxPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CUDAExecutionProvider", ) diff --git a/docs/source/quicktour.mdx b/docs/source/quicktour.mdx index 9574ecac4a6a..f54c170dd408 100644 --- a/docs/source/quicktour.mdx +++ b/docs/source/quicktour.mdx @@ -68,7 +68,7 @@ You can save the image by simply calling: More advanced models, like [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) require you to accept a [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model. This is due to the improved image generation capabilities of the model and the potentially harmful content that could be produced with it. -Long story short: Head over to your stable diffusion model of choice, *e.g.* [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4), read through the license and click-accept to get +Long story short: Head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v-1-5`](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read through the license and click-accept to get access to the model. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
Having "click-accepted" the license, you can save your token: @@ -77,13 +77,13 @@ Having "click-accepted" the license, you can save your token: AUTH_TOKEN = "" ``` -You can then load [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) +You can then load [`runwayml/stable-diffusion-v-1-5`](https://huggingface.co/runwayml/stable-diffusion-v-1-5) just like we did before only that now you need to pass your `AUTH_TOKEN`: ```python >>> from diffusers import DiffusionPipeline ->>> generator = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=AUTH_TOKEN) +>>> generator = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", use_auth_token=AUTH_TOKEN) ``` If you do not pass your authentication token you will see that the diffusion system will not be correctly @@ -95,7 +95,7 @@ the weights locally via: ``` git lfs install -git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 +git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5 ``` and then load locally saved weights into the pipeline. This way, you do not need to pass an authentication @@ -125,7 +125,7 @@ you could use it as follows: >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") >>> generator = StableDiffusionPipeline.from_pretrained( -... "CompVis/stable-diffusion-v1-4", scheduler=scheduler, use_auth_token=AUTH_TOKEN +... "runwayml/stable-diffusion-v-1-5", scheduler=scheduler, use_auth_token=AUTH_TOKEN ... ) ``` diff --git a/docs/source/training/text_inversion.mdx b/docs/source/training/text_inversion.mdx index 13ea7c942b4e..381f6bc5f7eb 100644 --- a/docs/source/training/text_inversion.mdx +++ b/docs/source/training/text_inversion.mdx @@ -64,7 +64,7 @@ accelerate config ### Cat toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
@@ -83,7 +83,7 @@ Now let's get our dataset.Download 3-4 images from [here](https://drive.google.c And launch the training using ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export DATA_DIR="path-to-dir-containing-images" accelerate launch textual_inversion.py \ diff --git a/docs/source/using-diffusers/custom_pipelines.mdx b/docs/source/using-diffusers/custom_pipelines.mdx index b52d405581b1..35253c660143 100644 --- a/docs/source/using-diffusers/custom_pipelines.mdx +++ b/docs/source/using-diffusers/custom_pipelines.mdx @@ -58,7 +58,7 @@ feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id) clip_model = CLIPModel.from_pretrained(clip_model_id) pipeline = DiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, diff --git a/docs/source/using-diffusers/img2img.mdx b/docs/source/using-diffusers/img2img.mdx index 62eaeea911c9..ffff4fffc8db 100644 --- a/docs/source/using-diffusers/img2img.mdx +++ b/docs/source/using-diffusers/img2img.mdx @@ -25,7 +25,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 ).to(device) # let's download an initial image diff --git a/examples/community/README.md b/examples/community/README.md index 54a3093ad3a3..19c516c4f163 100644 --- a/examples/community/README.md +++ b/examples/community/README.md @@ -16,7 +16,7 @@ If a community doesn't work as expected, please open an issue and ping the autho To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines, we will merge them quickly. 
```py -pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", custom_pipeline="filename_in_the_community_folder") +pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", custom_pipeline="filename_in_the_community_folder") ``` ## Example usages @@ -39,7 +39,7 @@ clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K", guided_pipeline = DiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, @@ -97,7 +97,7 @@ from diffusers import DiffusionPipeline import torch pipe = DiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision='fp16', torch_dtype=torch.float16, safety_checker=None, # Very important for videos...lots of false positives while interpolating @@ -139,7 +139,7 @@ def download_image(url): response = requests.get(url) return PIL.Image.open(BytesIO(response.content)).convert("RGB") -pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16") +pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16") pipe.to("cuda") pipe.enable_attention_slicing() @@ -202,7 +202,7 @@ from diffusers import DiffusionPipeline import torch pipe = DiffusionPipeline.from_pretrained( - 'CompVis/stable-diffusion-v1-4', + 'runwayml/stable-diffusion-v-1-5', custom_pipeline="lpw_stable_diffusion_onnx", revision="onnx", provider="CUDAExecutionProvider" diff --git a/examples/community/interpolate_stable_diffusion.py b/examples/community/interpolate_stable_diffusion.py index 97116bdc77b4..7ea2c736b28a 100644 --- a/examples/community/interpolate_stable_diffusion.py +++ b/examples/community/interpolate_stable_diffusion.py @@ -69,7 +69,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/examples/community/lpw_stable_diffusion.py b/examples/community/lpw_stable_diffusion.py index 3d4ec23e3aea..8c6ed5a465f9 100644 --- a/examples/community/lpw_stable_diffusion.py +++ b/examples/community/lpw_stable_diffusion.py @@ -389,7 +389,7 @@ class StableDiffusionLongPromptWeightingPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/examples/community/stable_diffusion_mega.py b/examples/community/stable_diffusion_mega.py index b253370fa2a6..137af1b303cf 100644 --- a/examples/community/stable_diffusion_mega.py +++ b/examples/community/stable_diffusion_mega.py @@ -46,7 +46,7 @@ class StableDiffusionMegaPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionMegaSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/examples/dreambooth/README.md b/examples/dreambooth/README.md index 62ab1b88c0cb..f5d9ee74cd73 100644 --- a/examples/dreambooth/README.md +++ b/examples/dreambooth/README.md @@ -22,7 +22,7 @@ accelerate config ### Dog toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -41,7 +41,7 @@ Now let's get our dataset. Download images from [here](https://drive.google.com/ And launch the training using ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export INSTANCE_DIR="path-to-instance-images" export OUTPUT_DIR="path-to-save-model" @@ -65,7 +65,7 @@ Prior-preservation is used to avoid overfitting and language-drift. Refer to the According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior-preservation. 200-300 works well for most cases. ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -95,7 +95,7 @@ With the help of gradient checkpointing and the 8-bit optimizer from bitsandbyte Install `bitsandbytes` with `pip install bitsandbytes` ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -136,7 +136,7 @@ it requires CUDA toolchain with the same version as pytorch. 8-bit optimizer does not seem to be compatible with DeepSpeed at the moment. 
```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -168,7 +168,7 @@ Pass the `--train_text_encoder` argument to the script to enable training `text_ ___Note: Training text encoder requires more memory, with this option the training won't fit on 16GB GPU. It needs at least 24GB VRAM.___ ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" diff --git a/examples/test_examples.py b/examples/test_examples.py index 15c8e05a5c12..26f613576195 100644 --- a/examples/test_examples.py +++ b/examples/test_examples.py @@ -101,7 +101,7 @@ def test_textual_inversion(self): with tempfile.TemporaryDirectory() as tmpdir: test_args = f""" examples/textual_inversion/textual_inversion.py - --pretrained_model_name_or_path CompVis/stable-diffusion-v1-4 + --pretrained_model_name_or_path runwayml/stable-diffusion-v-1-5 --train_data_dir docs/source/imgs --learnable_property object --placeholder_token diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md index 6aca642cda4a..ad22acc65aab 100644 --- a/examples/text_to_image/README.md +++ b/examples/text_to_image/README.md @@ -25,7 +25,7 @@ accelerate config ### Pokemon example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -43,7 +43,7 @@ If you have already cloned the repo, then you won't need to go through these ste With `gradient_checkpointing` and `mixed_precision` it should be possible to fine tune the model on a single 24GB GPU. For higher `batch_size` and faster training it's better to use GPUs with >30GB memory. ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export dataset_name="lambdalabs/pokemon-blip-captions" accelerate launch train_text_to_image.py \ @@ -67,7 +67,7 @@ To run on your own training files prepare the dataset according to the format re If you wish to use custom loading logic, you should modify the script, we have left pointers for that in the training script. 
```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export TRAIN_DIR="path_to_your_dataset" accelerate launch train_text_to_image.py \ diff --git a/examples/textual_inversion/README.md b/examples/textual_inversion/README.md index 05d8ffb8c9f2..b1aee2e4d331 100644 --- a/examples/textual_inversion/README.md +++ b/examples/textual_inversion/README.md @@ -29,7 +29,7 @@ accelerate config ### Cat toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -48,7 +48,7 @@ Now let's get our dataset.Download 3-4 images from [here](https://drive.google.c And launch the training using ```bash -export MODEL_NAME="CompVis/stable-diffusion-v1-4" +export MODEL_NAME="runwayml/stable-diffusion-v-1-5" export DATA_DIR="path-to-dir-containing-images" accelerate launch textual_inversion.py \ diff --git a/src/diffusers/modeling_flax_utils.py b/src/diffusers/modeling_flax_utils.py index 6cb30a26f7d5..645e623691d8 100644 --- a/src/diffusers/modeling_flax_utils.py +++ b/src/diffusers/modeling_flax_utils.py @@ -105,14 +105,14 @@ def to_bf16(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # load model - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision >>> params = model.to_bf16(params) >>> # If you don't want to cast certain parameters (for example layer norm bias and scale) >>> # then pass the mask as follows >>> from flax import traverse_util - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> flat_params = traverse_util.flatten_dict(params) >>> mask = { ... 
path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) @@ -141,7 +141,7 @@ def to_fp32(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # Download model and configuration from huggingface.co - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # By default, the model params will be in fp32, to illustrate the use of this method, >>> # we'll first cast to fp16 and back to fp32 >>> params = model.to_f16(params) @@ -171,14 +171,14 @@ def to_fp16(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # load model - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # By default, the model params will be in fp32, to cast these to float16 >>> params = model.to_fp16(params) >>> # If you want don't want to cast certain parameters (for example layer norm bias and scale) >>> # then pass the mask as follows >>> from flax import traverse_util - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> flat_params = traverse_util.flatten_dict(params) >>> mask = { ... path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) @@ -216,7 +216,7 @@ def from_pretrained( - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are namespaced under a user or organization name, like - `CompVis/stable-diffusion-v1-4`. + `runwayml/stable-diffusion-v-1-5`. - A path to a *directory* containing model weights saved using [`~ModelMixin.save_pretrained`], e.g., `./my_model_directory/`. dtype (`jax.numpy.dtype`, *optional*, defaults to `jax.numpy.float32`): @@ -273,7 +273,7 @@ def from_pretrained( >>> from diffusers import FlaxUNet2DConditionModel >>> # Download model and configuration from huggingface.co and cache. - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable). 
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("./test/saved_model/") ```""" diff --git a/src/diffusers/pipeline_flax_utils.py b/src/diffusers/pipeline_flax_utils.py index d55338b50343..fee3768ec919 100644 --- a/src/diffusers/pipeline_flax_utils.py +++ b/src/diffusers/pipeline_flax_utils.py @@ -244,7 +244,7 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated - models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"CompVis/stable-diffusion-v1-4"` + models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v-1-5"` @@ -266,13 +266,13 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P >>> # Download pipeline that requires an authorization token >>> # For more information on access tokens, please refer to this section >>> # of the documentation](https://huggingface.co/docs/hub/security-tokens) - >>> pipeline = FlaxDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # Download pipeline, but overwrite scheduler >>> from diffusers import LMSDiscreteScheduler >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - >>> pipeline = FlaxDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", scheduler=scheduler) + >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", scheduler=scheduler) ``` """ cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) diff --git a/src/diffusers/pipeline_utils.py b/src/diffusers/pipeline_utils.py index 37bc228b10e6..26a49952706b 100644 --- a/src/diffusers/pipeline_utils.py +++ b/src/diffusers/pipeline_utils.py @@ -317,7 +317,7 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated - models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"CompVis/stable-diffusion-v1-4"` + models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v-1-5"` @@ -339,13 +339,13 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P >>> # Download pipeline that requires an authorization token >>> # For more information on access tokens, please refer to this section >>> # of the documentation](https://huggingface.co/docs/hub/security-tokens) - >>> pipeline = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") + >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") >>> # Download pipeline, but overwrite scheduler >>> from diffusers import LMSDiscreteScheduler >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - >>> pipeline = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", scheduler=scheduler) + >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", scheduler=scheduler) ``` """ cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) diff --git a/src/diffusers/pipelines/README.md b/src/diffusers/pipelines/README.md index 90752d69f25f..9d1129d13ef6 100644 --- a/src/diffusers/pipelines/README.md +++ b/src/diffusers/pipelines/README.md @@ -55,8 +55,8 @@ Diffusion models often consist of multiple 
independently-trained models or other Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one. During inference, we however want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality: -- [`from_pretrained` method](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L139) that accepts a Hugging Face Hub repository id, *e.g.* [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) or a path to a local directory, *e.g.* -"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [CompVis/stable-diffusion-v1-4/model_index.json](https://huggingface.co/CompVis/stable-diffusion-v1-4/blob/main/model_index.json), which defines all components that should be +- [`from_pretrained` method](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L139) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v-1-5](https://huggingface.co/runwayml/stable-diffusion-v-1-5) or a path to a local directory, *e.g.* +"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v-1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v-1-5/blob/main/model_index.json), which defines all components that should be loaded into the pipelines. More specifically, for each model/component one needs to define the format `: ["", ""]`. `` is the attribute name given to the loaded instance of `` which can be found in the library or pipeline folder called `""`. - [`save_pretrained`](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L90) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`. 
In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated @@ -88,7 +88,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler -pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -111,7 +111,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16, ).to(device) diff --git a/src/diffusers/pipelines/stable_diffusion/README.md b/src/diffusers/pipelines/stable_diffusion/README.md index 47c38acbdb35..9cfd82c629fe 100644 --- a/src/diffusers/pipelines/stable_diffusion/README.md +++ b/src/diffusers/pipelines/stable_diffusion/README.md @@ -13,7 +13,7 @@ The summary of the model is the following: - Stable Diffusion has the same architecture as [Latent Diffusion](https://arxiv.org/abs/2112.10752) but uses a frozen CLIP Text Encoder instead of training the text encoder jointly with the diffusion model. - An in-detail explanation of the Stable Diffusion model can be found under [Stable Diffusion with ๐Ÿงจ Diffusers](https://huggingface.co/blog/stable_diffusion). - If you don't want to rely on the Hugging Face Hub and having to pass a authentication token, you can -download the weights with `git lfs install; git clone https://huggingface.co/CompVis/stable-diffusion-v1-4` and instead pass the local path to the cloned folder to `from_pretrained` as shown below. +download the weights with `git lfs install; git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5` and instead pass the local path to the cloned folder to `from_pretrained` as shown below. - Stable Diffusion can work with a variety of different samplers as is shown below. ## Available Pipelines: @@ -33,14 +33,14 @@ If you want to download the model weights using a single Python line, you need t ```python from diffusers import DiffusionPipeline -pipeline = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") ``` This however can make it difficult to build applications on top of `diffusers` as you will always have to pass the token around. 
A potential way to solve this issue is by downloading the weights to a local path `"./stable-diffusion-v1-4"`: ``` git lfs install -git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 +git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5 ``` and simply passing the local path to `from_pretrained`: @@ -57,7 +57,7 @@ pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4") # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -75,7 +75,7 @@ from diffusers import StableDiffusionPipeline, DDIMScheduler scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False) pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", scheduler=scheduler, ).to("cuda") @@ -98,7 +98,7 @@ lms = LMSDiscreteScheduler( ) pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", scheduler=lms, ).to("cuda") diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py b/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py index 18c008f8806b..fb7c91c01342 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py @@ -45,7 +45,7 @@ class FlaxStableDiffusionPipeline(FlaxDiffusionPipeline): [`FlaxDDIMScheduler`], [`FlaxLMSDiscreteScheduler`], or [`FlaxPNDMScheduler`]. safety_checker ([`FlaxStableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py index 89cc054a424e..99783fc19b0d 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py @@ -50,7 +50,7 @@ class OnnxStableDiffusionImg2ImgPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py index 30f8d7fcc3b8..3c62bff67c0d 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py @@ -63,7 +63,7 @@ class OnnxStableDiffusionInpaintPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py index 8ae51999a7b3..eeb9c909b8b0 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py @@ -40,7 +40,7 @@ class StableDiffusionPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py index 799fd459bbd9..06705d64ead1 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py @@ -52,7 +52,7 @@ class StableDiffusionImg2ImgPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py index abcc7fba6e8a..214118df5811 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py @@ -59,7 +59,7 @@ class StableDiffusionInpaintPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py index 038178dd9a0a..0d3b8c19ac28 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py @@ -66,7 +66,7 @@ class StableDiffusionInpaintPipelineLegacy(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/tests/test_pipelines.py b/tests/test_pipelines.py index 357fb11a0e8c..b4c4885cacb1 100644 --- a/tests/test_pipelines.py +++ b/tests/test_pipelines.py @@ -127,7 +127,7 @@ def test_load_pipeline_from_git(self): clip_model = CLIPModel.from_pretrained(clip_model_id, torch_dtype=torch.float16) pipeline = DiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, @@ -1819,7 +1819,7 @@ def test_lms_stable_diffusion_pipeline(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_memory_chunking(self): torch.cuda.reset_peak_memory_stats() - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1859,7 +1859,7 @@ def test_stable_diffusion_memory_chunking(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_text2img_pipeline_fp16(self): torch.cuda.reset_peak_memory_stats() - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1895,7 +1895,7 @@ def test_stable_diffusion_text2img_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1927,7 +1927,7 @@ def test_stable_diffusion_img2img_pipeline(self): init_image = init_image.resize((768, 512)) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1969,7 +1969,7 @@ def test_stable_diffusion_img2img_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, scheduler=lms, @@ -2097,7 +2097,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -2184,7 +2184,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, scheduler=lms, @@ -2214,7 +2214,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): @slow def test_stable_diffusion_onnx(self): sd_pipe = OnnxStableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v-1-5", revision="onnx", 
provider="CPUExecutionProvider" ) prompt = "A painting of a squirrel eating a burger" @@ -2236,7 +2236,7 @@ def test_stable_diffusion_img2img_onnx(self): ) init_image = init_image.resize((768, 512)) pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2322,7 +2322,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No test_callback_fn.has_been_called = False pipe = StableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 ) pipe = pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2374,7 +2374,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No init_image = init_image.resize((768, 512)) pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2433,7 +2433,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No ) pipe = StableDiffusionInpaintPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2483,7 +2483,7 @@ def test_callback_fn(step: int, timestep: int, latents: np.ndarray) -> None: test_callback_fn.has_been_called = False pipe = OnnxStableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2503,7 +2503,7 @@ def test_stable_diffusion_accelerate_load_works(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - model_id = "CompVis/stable-diffusion-v1-4" + model_id = "runwayml/stable-diffusion-v-1-5" _ = StableDiffusionPipeline.from_pretrained( model_id, revision="fp16", torch_dtype=torch.float16, use_auth_token=True, device_map="auto" ).to(torch_device) @@ -2517,7 +2517,7 @@ def test_stable_diffusion_accelerate_load_reduces_memory_footprint(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - pipeline_id = "CompVis/stable-diffusion-v1-4" + pipeline_id = "runwayml/stable-diffusion-v-1-5" torch.cuda.empty_cache() gc.collect() diff --git a/tests/test_pipelines_flax.py b/tests/test_pipelines_flax.py index 9256944815c7..d269c673ac31 100644 --- a/tests/test_pipelines_flax.py +++ b/tests/test_pipelines_flax.py @@ -69,7 +69,7 @@ def test_dummy_all_tpus(self): def test_stable_diffusion_v1_4(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="flax", safety_checker=None + "runwayml/stable-diffusion-v-1-5", revision="flax", safety_checker=None ) prompt = ( @@ -99,7 +99,7 @@ def test_stable_diffusion_v1_4(self): def test_stable_diffusion_v1_4_bfloat_16(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - 
"CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16, safety_checker=None + "runwayml/stable-diffusion-v-1-5", revision="bf16", dtype=jnp.bfloat16, safety_checker=None ) prompt = ( @@ -129,7 +129,7 @@ def test_stable_diffusion_v1_4_bfloat_16(self): def test_stable_diffusion_v1_4_bfloat_16_with_safety(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16 + "runwayml/stable-diffusion-v-1-5", revision="bf16", dtype=jnp.bfloat16 ) prompt = ( @@ -165,7 +165,7 @@ def test_stable_diffusion_v1_4_bfloat_16_ddim(self): ) pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "CompVis/stable-diffusion-v1-4", + "runwayml/stable-diffusion-v-1-5", revision="bf16", dtype=jnp.bfloat16, scheduler=scheduler, From 355746f0f6d14900767e2adc0e867633155cc3d7 Mon Sep 17 00:00:00 2001 From: anton-l Date: Thu, 20 Oct 2022 12:22:15 +0200 Subject: [PATCH 03/12] v-1-5 -> v1-5 --- docs/source/api/pipelines/overview.mdx | 8 ++--- docs/source/optimization/fp16.mdx | 10 +++--- docs/source/optimization/mps.mdx | 2 +- docs/source/optimization/onnx.mdx | 2 +- docs/source/quicktour.mdx | 10 +++--- docs/source/training/text_inversion.mdx | 4 +-- .../using-diffusers/custom_pipelines.mdx | 2 +- docs/source/using-diffusers/img2img.mdx | 2 +- examples/community/README.md | 10 +++--- .../community/interpolate_stable_diffusion.py | 2 +- examples/community/lpw_stable_diffusion.py | 2 +- examples/community/stable_diffusion_mega.py | 2 +- examples/dreambooth/README.md | 12 +++---- examples/test_examples.py | 2 +- examples/text_to_image/README.md | 6 ++-- examples/textual_inversion/README.md | 4 +-- src/diffusers/modeling_flax_utils.py | 14 ++++---- src/diffusers/pipeline_flax_utils.py | 6 ++-- src/diffusers/pipeline_utils.py | 6 ++-- src/diffusers/pipelines/README.md | 8 ++--- .../pipelines/stable_diffusion/README.md | 12 +++---- .../pipeline_flax_stable_diffusion.py | 2 +- .../pipeline_onnx_stable_diffusion_img2img.py | 2 +- .../pipeline_onnx_stable_diffusion_inpaint.py | 2 +- .../pipeline_stable_diffusion.py | 2 +- .../pipeline_stable_diffusion_img2img.py | 2 +- .../pipeline_stable_diffusion_inpaint.py | 2 +- ...ipeline_stable_diffusion_inpaint_legacy.py | 2 +- tests/test_pipelines.py | 32 +++++++++---------- tests/test_pipelines_flax.py | 8 ++--- 30 files changed, 90 insertions(+), 90 deletions(-) diff --git a/docs/source/api/pipelines/overview.mdx b/docs/source/api/pipelines/overview.mdx index 4fadd585ecd4..af711a02d9f3 100644 --- a/docs/source/api/pipelines/overview.mdx +++ b/docs/source/api/pipelines/overview.mdx @@ -67,8 +67,8 @@ Diffusion models often consist of multiple independently-trained models or other Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one. During inference, we however want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality: -- [`from_pretrained` method](../diffusion_pipeline) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v-1-5](https://huggingface.co/runwayml/stable-diffusion-v-1-5) or a path to a local directory, *e.g.* -"./stable-diffusion". 
To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v-1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v-1-5/blob/main/model_index.json), which defines all components that should be +- [`from_pretrained` method](../diffusion_pipeline) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) or a path to a local directory, *e.g.* +"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), which defines all components that should be loaded into the pipelines. More specifically, for each model/component one needs to define the format `: ["", ""]`. `` is the attribute name given to the loaded instance of `` which can be found in the library or pipeline folder called `""`. - [`save_pretrained`](../diffusion_pipeline) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`. In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated @@ -100,7 +100,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler -pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -123,7 +123,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 ).to(device) # let's download an initial image diff --git a/docs/source/optimization/fp16.mdx b/docs/source/optimization/fp16.mdx index c7332fa37315..d1dd87f7652f 100644 --- a/docs/source/optimization/fp16.mdx +++ b/docs/source/optimization/fp16.mdx @@ -56,7 +56,7 @@ If you use a CUDA GPU, you can take advantage of `torch.autocast` to perform inf from torch import autocast from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -72,7 +72,7 @@ To save more GPU memory and get even more speed, you can load and run the model ```Python pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16, ) @@ -97,7 +97,7 @@ import torch from diffusers import StableDiffusionPipeline pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="fp16", 
torch_dtype=torch.float16, ) @@ -152,7 +152,7 @@ def generate_inputs(): pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16, ).to("cuda") @@ -216,7 +216,7 @@ class UNet2DConditionOutput: pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16, ).to("cuda") diff --git a/docs/source/optimization/mps.mdx b/docs/source/optimization/mps.mdx index 21be746b69cc..754adae5d9b2 100644 --- a/docs/source/optimization/mps.mdx +++ b/docs/source/optimization/mps.mdx @@ -31,7 +31,7 @@ We recommend to "prime" the pipeline using an additional one-time pass through i # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") pipe = pipe.to("mps") prompt = "a photo of an astronaut riding a horse on mars" diff --git a/docs/source/optimization/onnx.mdx b/docs/source/optimization/onnx.mdx index c5a55141bc9c..e79efbde0742 100644 --- a/docs/source/optimization/onnx.mdx +++ b/docs/source/optimization/onnx.mdx @@ -28,7 +28,7 @@ The snippet below demonstrates how to use the ONNX runtime. You need to use `Sta from diffusers import StableDiffusionOnnxPipeline pipe = StableDiffusionOnnxPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CUDAExecutionProvider", ) diff --git a/docs/source/quicktour.mdx b/docs/source/quicktour.mdx index f54c170dd408..f17745899c55 100644 --- a/docs/source/quicktour.mdx +++ b/docs/source/quicktour.mdx @@ -68,7 +68,7 @@ You can save the image by simply calling: More advanced models, like [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) require you to accept a [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model. This is due to the improved image generation capabilities of the model and the potentially harmful content that could be produced with it. -Long story short: Head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v-1-5`](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read through the license and click-accept to get +Long story short: Head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5), read through the license and click-accept to get access to the model. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
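The `mps.mdx` hunk above keeps the recommendation to "prime" the pipeline with an additional one-time pass, but the warm-up code itself is not visible in the hunk. A minimal sketch of what that pass might look like (the single-step warm-up call and the discarded result are assumptions for illustration, not something this patch changes):

```python
from diffusers import StableDiffusionPipeline

# Load on Apple Silicon via the mps backend, as in the mps.mdx example above.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")

prompt = "a photo of an astronaut riding a horse on mars"

# One-time "priming" pass: a single inference step whose output is discarded,
# so the first real generation does not pay the initial warm-up cost.
_ = pipe(prompt, num_inference_steps=1)

image = pipe(prompt).images[0]
```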
Having "click-accepted" the license, you can save your token: @@ -77,13 +77,13 @@ Having "click-accepted" the license, you can save your token: AUTH_TOKEN = "" ``` -You can then load [`runwayml/stable-diffusion-v-1-5`](https://huggingface.co/runwayml/stable-diffusion-v-1-5) +You can then load [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) just like we did before only that now you need to pass your `AUTH_TOKEN`: ```python >>> from diffusers import DiffusionPipeline ->>> generator = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", use_auth_token=AUTH_TOKEN) +>>> generator = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_auth_token=AUTH_TOKEN) ``` If you do not pass your authentication token you will see that the diffusion system will not be correctly @@ -95,7 +95,7 @@ the weights locally via: ``` git lfs install -git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5 +git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 ``` and then load locally saved weights into the pipeline. This way, you do not need to pass an authentication @@ -125,7 +125,7 @@ you could use it as follows: >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") >>> generator = StableDiffusionPipeline.from_pretrained( -... "runwayml/stable-diffusion-v-1-5", scheduler=scheduler, use_auth_token=AUTH_TOKEN +... "runwayml/stable-diffusion-v1-5", scheduler=scheduler, use_auth_token=AUTH_TOKEN ... ) ``` diff --git a/docs/source/training/text_inversion.mdx b/docs/source/training/text_inversion.mdx index 381f6bc5f7eb..ec32ec8ec44e 100644 --- a/docs/source/training/text_inversion.mdx +++ b/docs/source/training/text_inversion.mdx @@ -64,7 +64,7 @@ accelerate config ### Cat toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
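Rather than keeping the token in an `AUTH_TOKEN` variable as the quicktour shows above, the hub client can also be logged in programmatically. A small sketch, assuming a `huggingface_hub` version that exposes `login` (the token string is a placeholder):

```python
from huggingface_hub import login
from diffusers import DiffusionPipeline

# One-time programmatic login; afterwards gated repos resolve without passing
# use_auth_token explicitly. "hf_xxx" is a placeholder, not a real token.
login(token="hf_xxx")

generator = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```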
@@ -83,7 +83,7 @@ Now let's get our dataset.Download 3-4 images from [here](https://drive.google.c And launch the training using ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export DATA_DIR="path-to-dir-containing-images" accelerate launch textual_inversion.py \ diff --git a/docs/source/using-diffusers/custom_pipelines.mdx b/docs/source/using-diffusers/custom_pipelines.mdx index 35253c660143..af973660a86e 100644 --- a/docs/source/using-diffusers/custom_pipelines.mdx +++ b/docs/source/using-diffusers/custom_pipelines.mdx @@ -58,7 +58,7 @@ feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id) clip_model = CLIPModel.from_pretrained(clip_model_id) pipeline = DiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, diff --git a/docs/source/using-diffusers/img2img.mdx b/docs/source/using-diffusers/img2img.mdx index ffff4fffc8db..defefffe063e 100644 --- a/docs/source/using-diffusers/img2img.mdx +++ b/docs/source/using-diffusers/img2img.mdx @@ -25,7 +25,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 ).to(device) # let's download an initial image diff --git a/examples/community/README.md b/examples/community/README.md index 19c516c4f163..dd0419898fe4 100644 --- a/examples/community/README.md +++ b/examples/community/README.md @@ -16,7 +16,7 @@ If a community doesn't work as expected, please open an issue and ping the autho To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines, we will merge them quickly. 
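Looping back to the `img2img.mdx` hunk above, which ends just before the initial image is downloaded and the pipeline is called: a short end-to-end sketch of that flow (the image URL is a placeholder, and the `init_image`/`strength` argument names are assumptions about this release of `StableDiffusionImg2ImgPipeline`). The community README's own loading example continues right below.

```python
from io import BytesIO

import requests
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16
).to("cuda")

# Placeholder URL; any RGB image resized to a supported resolution works here.
response = requests.get("https://example.com/sketch-mountains-input.jpg")
init_image = Image.open(BytesIO(response.content)).convert("RGB").resize((768, 512))

image = pipe(
    prompt="A fantasy landscape, trending on artstation",
    init_image=init_image,
    strength=0.75,
    guidance_scale=7.5,
).images[0]
image.save("fantasy_landscape.png")
```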
```py -pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", custom_pipeline="filename_in_the_community_folder") +pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="filename_in_the_community_folder") ``` ## Example usages @@ -39,7 +39,7 @@ clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K", guided_pipeline = DiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, @@ -97,7 +97,7 @@ from diffusers import DiffusionPipeline import torch pipe = DiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision='fp16', torch_dtype=torch.float16, safety_checker=None, # Very important for videos...lots of false positives while interpolating @@ -139,7 +139,7 @@ def download_image(url): response = requests.get(url) return PIL.Image.open(BytesIO(response.content)).convert("RGB") -pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16") +pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16") pipe.to("cuda") pipe.enable_attention_slicing() @@ -202,7 +202,7 @@ from diffusers import DiffusionPipeline import torch pipe = DiffusionPipeline.from_pretrained( - 'runwayml/stable-diffusion-v-1-5', + 'runwayml/stable-diffusion-v1-5', custom_pipeline="lpw_stable_diffusion_onnx", revision="onnx", provider="CUDAExecutionProvider" diff --git a/examples/community/interpolate_stable_diffusion.py b/examples/community/interpolate_stable_diffusion.py index 7ea2c736b28a..1de925d562c2 100644 --- a/examples/community/interpolate_stable_diffusion.py +++ b/examples/community/interpolate_stable_diffusion.py @@ -69,7 +69,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/examples/community/lpw_stable_diffusion.py b/examples/community/lpw_stable_diffusion.py index 8c6ed5a465f9..42fcc7330265 100644 --- a/examples/community/lpw_stable_diffusion.py +++ b/examples/community/lpw_stable_diffusion.py @@ -389,7 +389,7 @@ class StableDiffusionLongPromptWeightingPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. 
feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/examples/community/stable_diffusion_mega.py b/examples/community/stable_diffusion_mega.py index 137af1b303cf..723951941576 100644 --- a/examples/community/stable_diffusion_mega.py +++ b/examples/community/stable_diffusion_mega.py @@ -46,7 +46,7 @@ class StableDiffusionMegaPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionMegaSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/examples/dreambooth/README.md b/examples/dreambooth/README.md index f5d9ee74cd73..c357d5094056 100644 --- a/examples/dreambooth/README.md +++ b/examples/dreambooth/README.md @@ -22,7 +22,7 @@ accelerate config ### Dog toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -41,7 +41,7 @@ Now let's get our dataset. Download images from [here](https://drive.google.com/ And launch the training using ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export INSTANCE_DIR="path-to-instance-images" export OUTPUT_DIR="path-to-save-model" @@ -65,7 +65,7 @@ Prior-preservation is used to avoid overfitting and language-drift. Refer to the According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior-preservation. 200-300 works well for most cases. ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -95,7 +95,7 @@ With the help of gradient checkpointing and the 8-bit optimizer from bitsandbyte Install `bitsandbytes` with `pip install bitsandbytes` ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -136,7 +136,7 @@ it requires CUDA toolchain with the same version as pytorch. 8-bit optimizer does not seem to be compatible with DeepSpeed at the moment. 
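Whichever launch variant is used (the DeepSpeed command follows below), the fine-tuned weights end up in `OUTPUT_DIR` and can be loaded back like any other checkpoint. A minimal sketch, assuming the hypothetical `path-to-save-model` directory and an `sks`-style instance prompt from the training run:

```python
import torch
from diffusers import StableDiffusionPipeline

# "path-to-save-model" mirrors the OUTPUT_DIR placeholder from the launch commands;
# the prompt reuses a hypothetical "sks" instance token bound during training.
pipe = StableDiffusionPipeline.from_pretrained(
    "path-to-save-model", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of sks dog in a bucket", num_inference_steps=50, guidance_scale=7.5
).images[0]
image.save("dog-bucket.png")
```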
```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" @@ -168,7 +168,7 @@ Pass the `--train_text_encoder` argument to the script to enable training `text_ ___Note: Training text encoder requires more memory, with this option the training won't fit on 16GB GPU. It needs at least 24GB VRAM.___ ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export INSTANCE_DIR="path-to-instance-images" export CLASS_DIR="path-to-class-images" export OUTPUT_DIR="path-to-save-model" diff --git a/examples/test_examples.py b/examples/test_examples.py index 26f613576195..eb86b18b75f0 100644 --- a/examples/test_examples.py +++ b/examples/test_examples.py @@ -101,7 +101,7 @@ def test_textual_inversion(self): with tempfile.TemporaryDirectory() as tmpdir: test_args = f""" examples/textual_inversion/textual_inversion.py - --pretrained_model_name_or_path runwayml/stable-diffusion-v-1-5 + --pretrained_model_name_or_path runwayml/stable-diffusion-v1-5 --train_data_dir docs/source/imgs --learnable_property object --placeholder_token diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md index ad22acc65aab..61945d164c79 100644 --- a/examples/text_to_image/README.md +++ b/examples/text_to_image/README.md @@ -25,7 +25,7 @@ accelerate config ### Pokemon example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -43,7 +43,7 @@ If you have already cloned the repo, then you won't need to go through these ste With `gradient_checkpointing` and `mixed_precision` it should be possible to fine tune the model on a single 24GB GPU. For higher `batch_size` and faster training it's better to use GPUs with >30GB memory. ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export dataset_name="lambdalabs/pokemon-blip-captions" accelerate launch train_text_to_image.py \ @@ -67,7 +67,7 @@ To run on your own training files prepare the dataset according to the format re If you wish to use custom loading logic, you should modify the script, we have left pointers for that in the training script. 
```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export TRAIN_DIR="path_to_your_dataset" accelerate launch train_text_to_image.py \ diff --git a/examples/textual_inversion/README.md b/examples/textual_inversion/README.md index b1aee2e4d331..3a1e71c54b62 100644 --- a/examples/textual_inversion/README.md +++ b/examples/textual_inversion/README.md @@ -29,7 +29,7 @@ accelerate config ### Cat toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v-1-5), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). @@ -48,7 +48,7 @@ Now let's get our dataset.Download 3-4 images from [here](https://drive.google.c And launch the training using ```bash -export MODEL_NAME="runwayml/stable-diffusion-v-1-5" +export MODEL_NAME="runwayml/stable-diffusion-v1-5" export DATA_DIR="path-to-dir-containing-images" accelerate launch textual_inversion.py \ diff --git a/src/diffusers/modeling_flax_utils.py b/src/diffusers/modeling_flax_utils.py index 645e623691d8..09cbe6e9eabc 100644 --- a/src/diffusers/modeling_flax_utils.py +++ b/src/diffusers/modeling_flax_utils.py @@ -105,14 +105,14 @@ def to_bf16(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # load model - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision >>> params = model.to_bf16(params) >>> # If you don't want to cast certain parameters (for example layer norm bias and scale) >>> # then pass the mask as follows >>> from flax import traverse_util - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> flat_params = traverse_util.flatten_dict(params) >>> mask = { ... 
path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) @@ -141,7 +141,7 @@ def to_fp32(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # Download model and configuration from huggingface.co - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # By default, the model params will be in fp32, to illustrate the use of this method, >>> # we'll first cast to fp16 and back to fp32 >>> params = model.to_f16(params) @@ -171,14 +171,14 @@ def to_fp16(self, params: Union[Dict, FrozenDict], mask: Any = None): >>> from diffusers import FlaxUNet2DConditionModel >>> # load model - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # By default, the model params will be in fp32, to cast these to float16 >>> params = model.to_fp16(params) >>> # If you want don't want to cast certain parameters (for example layer norm bias and scale) >>> # then pass the mask as follows >>> from flax import traverse_util - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> flat_params = traverse_util.flatten_dict(params) >>> mask = { ... path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) @@ -216,7 +216,7 @@ def from_pretrained( - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are namespaced under a user or organization name, like - `runwayml/stable-diffusion-v-1-5`. + `runwayml/stable-diffusion-v1-5`. - A path to a *directory* containing model weights saved using [`~ModelMixin.save_pretrained`], e.g., `./my_model_directory/`. dtype (`jax.numpy.dtype`, *optional*, defaults to `jax.numpy.float32`): @@ -273,7 +273,7 @@ def from_pretrained( >>> from diffusers import FlaxUNet2DConditionModel >>> # Download model and configuration from huggingface.co and cache. - >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable). 
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("./test/saved_model/") ```""" diff --git a/src/diffusers/pipeline_flax_utils.py b/src/diffusers/pipeline_flax_utils.py index fee3768ec919..3c23693b40ef 100644 --- a/src/diffusers/pipeline_flax_utils.py +++ b/src/diffusers/pipeline_flax_utils.py @@ -244,7 +244,7 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated - models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v-1-5"` + models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v1-5"` @@ -266,13 +266,13 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P >>> # Download pipeline that requires an authorization token >>> # For more information on access tokens, please refer to this section >>> # of the documentation](https://huggingface.co/docs/hub/security-tokens) - >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # Download pipeline, but overwrite scheduler >>> from diffusers import LMSDiscreteScheduler >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", scheduler=scheduler) + >>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=scheduler) ``` """ cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) diff --git a/src/diffusers/pipeline_utils.py b/src/diffusers/pipeline_utils.py index 26a49952706b..c61a9583ca1c 100644 --- a/src/diffusers/pipeline_utils.py +++ b/src/diffusers/pipeline_utils.py @@ -317,7 +317,7 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated - models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v-1-5"` + models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v1-5"` @@ -339,13 +339,13 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P >>> # Download pipeline that requires an authorization token >>> # For more information on access tokens, please refer to this section >>> # of the documentation](https://huggingface.co/docs/hub/security-tokens) - >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") + >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") >>> # Download pipeline, but overwrite scheduler >>> from diffusers import LMSDiscreteScheduler >>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5", scheduler=scheduler) + >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=scheduler) ``` """ cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) diff --git a/src/diffusers/pipelines/README.md b/src/diffusers/pipelines/README.md index 9d1129d13ef6..66d3a8815d4e 100644 --- a/src/diffusers/pipelines/README.md +++ b/src/diffusers/pipelines/README.md @@ -55,8 +55,8 @@ Diffusion models often consist of multiple 
independently-trained models or other Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one. During inference, we however want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality: -- [`from_pretrained` method](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L139) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v-1-5](https://huggingface.co/runwayml/stable-diffusion-v-1-5) or a path to a local directory, *e.g.* -"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v-1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v-1-5/blob/main/model_index.json), which defines all components that should be +- [`from_pretrained` method](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L139) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) or a path to a local directory, *e.g.* +"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), which defines all components that should be loaded into the pipelines. More specifically, for each model/component one needs to define the format `: ["", ""]`. `` is the attribute name given to the loaded instance of `` which can be found in the library or pipeline folder called `""`. - [`save_pretrained`](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L90) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`. 
In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated @@ -88,7 +88,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler -pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -111,7 +111,7 @@ from diffusers import StableDiffusionImg2ImgPipeline # load the pipeline device = "cuda" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16, ).to(device) diff --git a/src/diffusers/pipelines/stable_diffusion/README.md b/src/diffusers/pipelines/stable_diffusion/README.md index 9cfd82c629fe..5e8665b3f9d9 100644 --- a/src/diffusers/pipelines/stable_diffusion/README.md +++ b/src/diffusers/pipelines/stable_diffusion/README.md @@ -13,7 +13,7 @@ The summary of the model is the following: - Stable Diffusion has the same architecture as [Latent Diffusion](https://arxiv.org/abs/2112.10752) but uses a frozen CLIP Text Encoder instead of training the text encoder jointly with the diffusion model. - An in-detail explanation of the Stable Diffusion model can be found under [Stable Diffusion with ๐Ÿงจ Diffusers](https://huggingface.co/blog/stable_diffusion). - If you don't want to rely on the Hugging Face Hub and having to pass a authentication token, you can -download the weights with `git lfs install; git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5` and instead pass the local path to the cloned folder to `from_pretrained` as shown below. +download the weights with `git lfs install; git clone https://huggingface.co/runwayml/stable-diffusion-v1-5` and instead pass the local path to the cloned folder to `from_pretrained` as shown below. - Stable Diffusion can work with a variety of different samplers as is shown below. ## Available Pipelines: @@ -33,14 +33,14 @@ If you want to download the model weights using a single Python line, you need t ```python from diffusers import DiffusionPipeline -pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") ``` This however can make it difficult to build applications on top of `diffusers` as you will always have to pass the token around. 
A potential way to solve this issue is by downloading the weights to a local path `"./stable-diffusion-v1-4"`: ``` git lfs install -git clone https://huggingface.co/runwayml/stable-diffusion-v-1-5 +git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 ``` and simply passing the local path to `from_pretrained`: @@ -57,7 +57,7 @@ pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4") # make sure you're logged in with `huggingface-cli login` from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v-1-5") +pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") pipe = pipe.to("cuda") prompt = "a photo of an astronaut riding a horse on mars" @@ -75,7 +75,7 @@ from diffusers import StableDiffusionPipeline, DDIMScheduler scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False) pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", scheduler=scheduler, ).to("cuda") @@ -98,7 +98,7 @@ lms = LMSDiscreteScheduler( ) pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", scheduler=lms, ).to("cuda") diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py b/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py index fb7c91c01342..e4f56d94dac8 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion.py @@ -45,7 +45,7 @@ class FlaxStableDiffusionPipeline(FlaxDiffusionPipeline): [`FlaxDDIMScheduler`], [`FlaxLMSDiscreteScheduler`], or [`FlaxPNDMScheduler`]. safety_checker ([`FlaxStableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py index 99783fc19b0d..ce3f3fbacbc7 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py @@ -50,7 +50,7 @@ class OnnxStableDiffusionImg2ImgPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
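The `safety_checker`/`feature_extractor` pair that these docstrings describe is easiest to see at call time. A minimal sketch with the renamed checkpoint (the fp16 revision, the prompt and the device are illustrative; the tests touched in this patch also show the module can be disabled by passing `safety_checker=None` to `from_pretrained`):

```python
import torch
from diffusers import StableDiffusionPipeline

# The safety_checker and feature_extractor are instantiated automatically from the
# repo's model_index.json when the pipeline is loaded.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16
).to("cuda")

result = pipe("a photo of an astronaut riding a horse on mars")
image = result.images[0]

# Each generated image is paired with a boolean flag produced by the safety checker.
print(result.nsfw_content_detected)
```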
""" diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py index 3c62bff67c0d..b45d968f66e3 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_inpaint.py @@ -63,7 +63,7 @@ class OnnxStableDiffusionInpaintPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py index eeb9c909b8b0..46def26e94cc 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py @@ -40,7 +40,7 @@ class StableDiffusionPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py index 06705d64ead1..db8134834564 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py @@ -52,7 +52,7 @@ class StableDiffusionImg2ImgPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py index 214118df5811..918a1241f996 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py @@ -59,7 +59,7 @@ class StableDiffusionInpaintPipeline(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. """ diff --git a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py index 0d3b8c19ac28..71a2be62128c 100644 --- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py +++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py @@ -66,7 +66,7 @@ class StableDiffusionInpaintPipelineLegacy(DiffusionPipeline): [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. safety_checker ([`StableDiffusionSafetyChecker`]): Classification module that estimates whether generated images could be considered offensive or harmful. - Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v-1-5) for details. + Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details. feature_extractor ([`CLIPFeatureExtractor`]): Model that extracts features from generated images to be used as inputs for the `safety_checker`. 
""" diff --git a/tests/test_pipelines.py b/tests/test_pipelines.py index b4c4885cacb1..0dd4ae703626 100644 --- a/tests/test_pipelines.py +++ b/tests/test_pipelines.py @@ -127,7 +127,7 @@ def test_load_pipeline_from_git(self): clip_model = CLIPModel.from_pretrained(clip_model_id, torch_dtype=torch.float16) pipeline = DiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, @@ -1819,7 +1819,7 @@ def test_lms_stable_diffusion_pipeline(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_memory_chunking(self): torch.cuda.reset_peak_memory_stats() - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1859,7 +1859,7 @@ def test_stable_diffusion_memory_chunking(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_text2img_pipeline_fp16(self): torch.cuda.reset_peak_memory_stats() - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1895,7 +1895,7 @@ def test_stable_diffusion_text2img_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1927,7 +1927,7 @@ def test_stable_diffusion_img2img_pipeline(self): init_image = init_image.resize((768, 512)) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1969,7 +1969,7 @@ def test_stable_diffusion_img2img_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, scheduler=lms, @@ -2097,7 +2097,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -2184,7 +2184,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, scheduler=lms, @@ -2214,7 +2214,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): @slow def test_stable_diffusion_onnx(self): sd_pipe = OnnxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v1-5", revision="onnx", 
provider="CPUExecutionProvider" ) prompt = "A painting of a squirrel eating a burger" @@ -2236,7 +2236,7 @@ def test_stable_diffusion_img2img_onnx(self): ) init_image = init_image.resize((768, 512)) pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2322,7 +2322,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No test_callback_fn.has_been_called = False pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 ) pipe = pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2374,7 +2374,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No init_image = init_image.resize((768, 512)) pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2433,7 +2433,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No ) pipe = StableDiffusionInpaintPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="fp16", torch_dtype=torch.float16 + "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2483,7 +2483,7 @@ def test_callback_fn(step: int, timestep: int, latents: np.ndarray) -> None: test_callback_fn.has_been_called = False pipe = OnnxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="onnx", provider="CPUExecutionProvider" + "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2503,7 +2503,7 @@ def test_stable_diffusion_accelerate_load_works(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - model_id = "runwayml/stable-diffusion-v-1-5" + model_id = "runwayml/stable-diffusion-v1-5" _ = StableDiffusionPipeline.from_pretrained( model_id, revision="fp16", torch_dtype=torch.float16, use_auth_token=True, device_map="auto" ).to(torch_device) @@ -2517,7 +2517,7 @@ def test_stable_diffusion_accelerate_load_reduces_memory_footprint(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - pipeline_id = "runwayml/stable-diffusion-v-1-5" + pipeline_id = "runwayml/stable-diffusion-v1-5" torch.cuda.empty_cache() gc.collect() diff --git a/tests/test_pipelines_flax.py b/tests/test_pipelines_flax.py index d269c673ac31..6b3746e1f057 100644 --- a/tests/test_pipelines_flax.py +++ b/tests/test_pipelines_flax.py @@ -69,7 +69,7 @@ def test_dummy_all_tpus(self): def test_stable_diffusion_v1_4(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="flax", safety_checker=None + "runwayml/stable-diffusion-v1-5", revision="flax", safety_checker=None ) prompt = ( @@ -99,7 +99,7 @@ def test_stable_diffusion_v1_4(self): def test_stable_diffusion_v1_4_bfloat_16(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - 
"runwayml/stable-diffusion-v-1-5", revision="bf16", dtype=jnp.bfloat16, safety_checker=None + "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16, safety_checker=None ) prompt = ( @@ -129,7 +129,7 @@ def test_stable_diffusion_v1_4_bfloat_16(self): def test_stable_diffusion_v1_4_bfloat_16_with_safety(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", revision="bf16", dtype=jnp.bfloat16 + "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16 ) prompt = ( @@ -165,7 +165,7 @@ def test_stable_diffusion_v1_4_bfloat_16_ddim(self): ) pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v-1-5", + "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16, scheduler=scheduler, From af6588678fee36c14f0942ef25a97b89490ab4d5 Mon Sep 17 00:00:00 2001 From: anton-l Date: Thu, 20 Oct 2022 12:28:30 +0200 Subject: [PATCH 04/12] revert test changes --- docs/source/quicktour.mdx | 4 +-- .../pipelines/stable_diffusion/README.md | 4 +-- tests/test_pipelines.py | 32 +++++++++---------- tests/test_pipelines_flax.py | 8 ++--- 4 files changed, 24 insertions(+), 24 deletions(-) diff --git a/docs/source/quicktour.mdx b/docs/source/quicktour.mdx index f17745899c55..c2bcf524cef1 100644 --- a/docs/source/quicktour.mdx +++ b/docs/source/quicktour.mdx @@ -99,11 +99,11 @@ git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 ``` and then load locally saved weights into the pipeline. This way, you do not need to pass an authentication -token. Assuming that `"./stable-diffusion-v1-4"` is the local path to the cloned stable-diffusion-v1-4 repo, +token. Assuming that `"./stable-diffusion-v1-5"` is the local path to the cloned stable-diffusion-v1-5 repo, you can also load the pipeline as follows: ```python ->>> generator = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-4") +>>> generator = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5") ``` Running the pipeline is then identical to the code above as it's the same model architecture. diff --git a/src/diffusers/pipelines/stable_diffusion/README.md b/src/diffusers/pipelines/stable_diffusion/README.md index 5e8665b3f9d9..9c92e7b1a354 100644 --- a/src/diffusers/pipelines/stable_diffusion/README.md +++ b/src/diffusers/pipelines/stable_diffusion/README.md @@ -36,7 +36,7 @@ from diffusers import DiffusionPipeline pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") ``` -This however can make it difficult to build applications on top of `diffusers` as you will always have to pass the token around. A potential way to solve this issue is by downloading the weights to a local path `"./stable-diffusion-v1-4"`: +This however can make it difficult to build applications on top of `diffusers` as you will always have to pass the token around. 
A potential way to solve this issue is by downloading the weights to a local path `"./stable-diffusion-v1-5"`: ``` git lfs install @@ -48,7 +48,7 @@ and simply passing the local path to `from_pretrained`: ```python from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4") +pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5") ``` ### Text-to-Image with default PLMS scheduler diff --git a/tests/test_pipelines.py b/tests/test_pipelines.py index 0dd4ae703626..357fb11a0e8c 100644 --- a/tests/test_pipelines.py +++ b/tests/test_pipelines.py @@ -127,7 +127,7 @@ def test_load_pipeline_from_git(self): clip_model = CLIPModel.from_pretrained(clip_model_id, torch_dtype=torch.float16) pipeline = DiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", + "CompVis/stable-diffusion-v1-4", custom_pipeline="clip_guided_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, @@ -1819,7 +1819,7 @@ def test_lms_stable_diffusion_pipeline(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_memory_chunking(self): torch.cuda.reset_peak_memory_stats() - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1859,7 +1859,7 @@ def test_stable_diffusion_memory_chunking(self): @unittest.skipIf(torch_device == "cpu", "Stable diffusion is supposed to run on GPU") def test_stable_diffusion_text2img_pipeline_fp16(self): torch.cuda.reset_peak_memory_stats() - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16).to( torch_device ) @@ -1895,7 +1895,7 @@ def test_stable_diffusion_text2img_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1927,7 +1927,7 @@ def test_stable_diffusion_img2img_pipeline(self): init_image = init_image.resize((768, 512)) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -1969,7 +1969,7 @@ def test_stable_diffusion_img2img_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionImg2ImgPipeline.from_pretrained( model_id, scheduler=lms, @@ -2097,7 +2097,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline(self): ) expected_image = np.array(expected_image, dtype=np.float32) / 255.0 - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, safety_checker=self.dummy_safety_checker, @@ -2184,7 +2184,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear") - model_id = "runwayml/stable-diffusion-v1-5" + model_id = 
"CompVis/stable-diffusion-v1-4" pipe = StableDiffusionInpaintPipeline.from_pretrained( model_id, scheduler=lms, @@ -2214,7 +2214,7 @@ def test_stable_diffusion_inpaint_legacy_pipeline_k_lms(self): @slow def test_stable_diffusion_onnx(self): sd_pipe = OnnxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CPUExecutionProvider" + "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" ) prompt = "A painting of a squirrel eating a burger" @@ -2236,7 +2236,7 @@ def test_stable_diffusion_img2img_onnx(self): ) init_image = init_image.resize((768, 512)) pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CPUExecutionProvider" + "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2322,7 +2322,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No test_callback_fn.has_been_called = False pipe = StableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 + "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 ) pipe = pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2374,7 +2374,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No init_image = init_image.resize((768, 512)) pipe = StableDiffusionImg2ImgPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 + "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2433,7 +2433,7 @@ def test_callback_fn(step: int, timestep: int, latents: torch.FloatTensor) -> No ) pipe = StableDiffusionInpaintPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16 + "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16 ) pipe.to(torch_device) pipe.set_progress_bar_config(disable=None) @@ -2483,7 +2483,7 @@ def test_callback_fn(step: int, timestep: int, latents: np.ndarray) -> None: test_callback_fn.has_been_called = False pipe = OnnxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="onnx", provider="CPUExecutionProvider" + "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider" ) pipe.set_progress_bar_config(disable=None) @@ -2503,7 +2503,7 @@ def test_stable_diffusion_accelerate_load_works(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - model_id = "runwayml/stable-diffusion-v1-5" + model_id = "CompVis/stable-diffusion-v1-4" _ = StableDiffusionPipeline.from_pretrained( model_id, revision="fp16", torch_dtype=torch.float16, use_auth_token=True, device_map="auto" ).to(torch_device) @@ -2517,7 +2517,7 @@ def test_stable_diffusion_accelerate_load_reduces_memory_footprint(self): if version.parse(version.parse(accelerate.__version__).base_version) < version.parse("0.14"): return - pipeline_id = "runwayml/stable-diffusion-v1-5" + pipeline_id = "CompVis/stable-diffusion-v1-4" torch.cuda.empty_cache() gc.collect() diff --git a/tests/test_pipelines_flax.py b/tests/test_pipelines_flax.py index 6b3746e1f057..9256944815c7 100644 --- a/tests/test_pipelines_flax.py +++ b/tests/test_pipelines_flax.py @@ -69,7 +69,7 @@ def test_dummy_all_tpus(self): def 
test_stable_diffusion_v1_4(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="flax", safety_checker=None + "CompVis/stable-diffusion-v1-4", revision="flax", safety_checker=None ) prompt = ( @@ -99,7 +99,7 @@ def test_stable_diffusion_v1_4(self): def test_stable_diffusion_v1_4_bfloat_16(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16, safety_checker=None + "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16, safety_checker=None ) prompt = ( @@ -129,7 +129,7 @@ def test_stable_diffusion_v1_4_bfloat_16(self): def test_stable_diffusion_v1_4_bfloat_16_with_safety(self): pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16 + "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16 ) prompt = ( @@ -165,7 +165,7 @@ def test_stable_diffusion_v1_4_bfloat_16_ddim(self): ) pipeline, params = FlaxStableDiffusionPipeline.from_pretrained( - "runwayml/stable-diffusion-v1-5", + "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16, scheduler=scheduler, From 3d315605d384683a3bab3df4729f97bcf5d020e5 Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:47:08 +0200 Subject: [PATCH 05/12] Update README.md Co-authored-by: Patrick von Platen --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index c0bc1910c364..d14eb33132cc 100644 --- a/README.md +++ b/README.md @@ -241,7 +241,7 @@ prompt_ids = pipeline.prepare_inputs(prompt) # shard inputs and rng params = replicate(params) -prng_seed = jax.random.split(prng_seed, 8) +prng_seed = jax.random.split(prng_seed, jax.device_count()) prompt_ids = shard(prompt_ids) images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images From 35086d25c83c1c5129877978e7fd4031c519866b Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:47:49 +0200 Subject: [PATCH 06/12] Update docs/source/quicktour.mdx Co-authored-by: Pedro Cuenca --- docs/source/quicktour.mdx | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/source/quicktour.mdx b/docs/source/quicktour.mdx index c2bcf524cef1..6476f42c4593 100644 --- a/docs/source/quicktour.mdx +++ b/docs/source/quicktour.mdx @@ -69,7 +69,6 @@ You can save the image by simply calling: More advanced models, like [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) require you to accept a [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model. This is due to the improved image generation capabilities of the model and the potentially harmful content that could be produced with it. Long story short: Head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5), read through the license and click-accept to get -access to the model. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
Having "click-accepted" the license, you can save your token: From 6a0eb1c9519fb6184c7caf2e5a121ad57868b2fc Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:48:07 +0200 Subject: [PATCH 07/12] Update README.md Co-authored-by: Pedro Cuenca --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index d14eb33132cc..9a4fa0298fbb 100644 --- a/README.md +++ b/README.md @@ -69,7 +69,7 @@ In order to get started, we recommend taking a look at two notebooks: Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information. -You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation. +You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation. ### Text-to-Image generation with Stable Diffusion From e943d9cb3868041319942f0ed2e157e2d6e67538 Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:48:31 +0200 Subject: [PATCH 08/12] Update docs/source/quicktour.mdx Co-authored-by: Pedro Cuenca --- docs/source/quicktour.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/quicktour.mdx b/docs/source/quicktour.mdx index 6476f42c4593..a652695f00ba 100644 --- a/docs/source/quicktour.mdx +++ b/docs/source/quicktour.mdx @@ -68,7 +68,7 @@ You can save the image by simply calling: More advanced models, like [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) require you to accept a [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model. This is due to the improved image generation capabilities of the model and the potentially harmful content that could be produced with it. 
-Long story short: Head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5), read through the license and click-accept to get +Please, head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). Having "click-accepted" the license, you can save your token: From a2ff98981d76780d9ce6e7c6f0b75fceade11344 Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:49:03 +0200 Subject: [PATCH 09/12] Update README.md Co-authored-by: Suraj Patil --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9a4fa0298fbb..2a294e2a30ae 100644 --- a/README.md +++ b/README.md @@ -208,7 +208,7 @@ prompt_ids = pipeline.prepare_inputs(prompt) # shard inputs and rng params = replicate(params) -prng_seed = jax.random.split(prng_seed, 8) +prng_seed = jax.random.split(prng_seed, jax.device_count()) prompt_ids = shard(prompt_ids) images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images From 6f0d34ef9c418c3d5fe77bd59647c4f2b5139d55 Mon Sep 17 00:00:00 2001 From: apolinario Date: Fri, 21 Oct 2022 12:50:04 +0200 Subject: [PATCH 10/12] Revert certain references to v1-5 --- docs/source/training/text_inversion.mdx | 2 +- examples/community/README.md | 4 ++-- examples/dreambooth/README.md | 12 ++++++------ examples/text_to_image/README.md | 6 +++--- examples/textual_inversion/README.md | 2 +- 5 files changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/source/training/text_inversion.mdx b/docs/source/training/text_inversion.mdx index ec32ec8ec44e..d44cef301ed7 100644 --- a/docs/source/training/text_inversion.mdx +++ b/docs/source/training/text_inversion.mdx @@ -64,7 +64,7 @@ accelerate config ### Cat toy example -You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree. +You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens). 
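[Editor's note — not part of the patch series] Patches 05 and 09 above replace the hard-coded `8` in `jax.random.split` with `jax.device_count()`, so the parallel Flax example in the README no longer assumes exactly eight accelerators (a TPU v3-8). The snippet below is a minimal sketch of that corrected usage, stitched together from the hunks above; the checkpoint, `bf16` revision, prompt, and step count are illustrative assumptions rather than anything these patches prescribe.

```python
# Editor's sketch: split the RNG across however many devices are actually present,
# mirroring the `jax.device_count()` change made in PATCH 05/09.
import jax
import jax.numpy as jnp
from flax.jax_utils import replicate
from flax.training.common_utils import shard

from diffusers import FlaxStableDiffusionPipeline

# Assumption: the bf16 branch also used by the Flax tests earlier in this series.
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16
)

prompt = "a photo of an astronaut riding a horse on mars"  # illustrative prompt
num_devices = jax.device_count()

# One copy of the prompt per device, then shard the inputs and replicate the params.
prompt_ids = pipeline.prepare_inputs([prompt] * num_devices)
prompt_ids = shard(prompt_ids)
params = replicate(params)

# The key change: the split count follows the device count instead of a fixed 8.
prng_seed = jax.random.split(jax.random.PRNGKey(0), num_devices)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps=50, jit=True).images
```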
diff --git a/examples/community/README.md b/examples/community/README.md
index dd0419898fe4..ed27ce54a482 100644
--- a/examples/community/README.md
+++ b/examples/community/README.md
@@ -97,7 +97,7 @@ from diffusers import DiffusionPipeline
 import torch
 pipe = DiffusionPipeline.from_pretrained(
-    "runwayml/stable-diffusion-v1-5",
+    "CompVis/stable-diffusion-v1-4",
     revision='fp16',
     torch_dtype=torch.float16,
     safety_checker=None,  # Very important for videos...lots of false positives while interpolating
@@ -202,7 +202,7 @@ from diffusers import DiffusionPipeline
 import torch
 pipe = DiffusionPipeline.from_pretrained(
-    'runwayml/stable-diffusion-v1-5',
+    'CompVis/stable-diffusion-v1-4',
     custom_pipeline="lpw_stable_diffusion_onnx",
     revision="onnx",
     provider="CUDAExecutionProvider"
diff --git a/examples/dreambooth/README.md b/examples/dreambooth/README.md
index c357d5094056..62ab1b88c0cb 100644
--- a/examples/dreambooth/README.md
+++ b/examples/dreambooth/README.md
@@ -22,7 +22,7 @@ accelerate config
 ### Dog toy example
-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
 You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).
@@ -41,7 +41,7 @@ Now let's get our dataset. Download images from [here](https://drive.google.com/
 And launch the training using
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export INSTANCE_DIR="path-to-instance-images"
 export OUTPUT_DIR="path-to-save-model"
@@ -65,7 +65,7 @@ Prior-preservation is used to avoid overfitting and language-drift. Refer to the
 According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior-preservation. 200-300 works well for most cases.
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export INSTANCE_DIR="path-to-instance-images"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
@@ -95,7 +95,7 @@ With the help of gradient checkpointing and the 8-bit optimizer from bitsandbyte
 Install `bitsandbytes` with `pip install bitsandbytes`
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export INSTANCE_DIR="path-to-instance-images"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
@@ -136,7 +136,7 @@ it requires CUDA toolchain with the same version as pytorch.
 8-bit optimizer does not seem to be compatible with DeepSpeed at the moment.
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export INSTANCE_DIR="path-to-instance-images"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
@@ -168,7 +168,7 @@ Pass the `--train_text_encoder` argument to the script to enable training `text_
 ___Note: Training text encoder requires more memory, with this option the training won't fit on 16GB GPU. It needs at least 24GB VRAM.___
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export INSTANCE_DIR="path-to-instance-images"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md
index 61945d164c79..6aca642cda4a 100644
--- a/examples/text_to_image/README.md
+++ b/examples/text_to_image/README.md
@@ -25,7 +25,7 @@ accelerate config
 ### Pokemon example
-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
 You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).
@@ -43,7 +43,7 @@ If you have already cloned the repo, then you won't need to go through these ste
 With `gradient_checkpointing` and `mixed_precision` it should be possible to fine tune the model on a single 24GB GPU. For higher `batch_size` and faster training it's better to use GPUs with >30GB memory.
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export dataset_name="lambdalabs/pokemon-blip-captions"
 accelerate launch train_text_to_image.py \
@@ -67,7 +67,7 @@ To run on your own training files prepare the dataset according to the format re
 If you wish to use custom loading logic, you should modify the script, we have left pointers for that in the training script.
 ```bash
-export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export TRAIN_DIR="path_to_your_dataset"
 accelerate launch train_text_to_image.py \
diff --git a/examples/textual_inversion/README.md b/examples/textual_inversion/README.md
index 3a1e71c54b62..801c42c8a0f8 100644
--- a/examples/textual_inversion/README.md
+++ b/examples/textual_inversion/README.md
@@ -29,7 +29,7 @@ accelerate config
 ### Cat toy example
-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
 You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

From f48fc0ddb855e2c9055282b6db2440476e14725f Mon Sep 17 00:00:00 2001
From: apolinario
Date: Fri, 21 Oct 2022 12:51:54 +0200
Subject: [PATCH 11/12] Docs changes

---
 examples/community/interpolate_stable_diffusion.py | 2 +-
 examples/community/lpw_stable_diffusion.py         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/community/interpolate_stable_diffusion.py b/examples/community/interpolate_stable_diffusion.py
index 1de925d562c2..97116bdc77b4 100644
--- a/examples/community/interpolate_stable_diffusion.py
+++ b/examples/community/interpolate_stable_diffusion.py
@@ -69,7 +69,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline):
             [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
-            Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
+            Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
         feature_extractor ([`CLIPFeatureExtractor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
diff --git a/examples/community/lpw_stable_diffusion.py b/examples/community/lpw_stable_diffusion.py
index 42fcc7330265..3d4ec23e3aea 100644
--- a/examples/community/lpw_stable_diffusion.py
+++ b/examples/community/lpw_stable_diffusion.py
@@ -389,7 +389,7 @@ class StableDiffusionLongPromptWeightingPipeline(DiffusionPipeline):
             [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
-            Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
+            Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
         feature_extractor ([`CLIPFeatureExtractor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """

From a77496b750389dddde86e7ed35c2ed6320f863c1 Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Mon, 24 Oct 2022 22:49:52 +0200
Subject: [PATCH 12/12] Apply suggestions from code review

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2a294e2a30ae..1dc1d7838a5d 100644
--- a/README.md
+++ b/README.md
@@ -66,7 +66,7 @@ In order to get started, we recommend taking a look at two notebooks:
 ## Stable Diffusion is fully compatible with `diffusers`!
-Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
+Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 4GB VRAM.
 See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information.
 You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully and tick the checkbox if you agree. You have to be a registered user in ๐Ÿค— Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
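[Editor's note — not part of the patch series] The ONNX tests reverted in PATCH 04 exercise CPU-only Stable Diffusion inference through ONNX Runtime. Below is a minimal sketch of that path, assuming the `onnx` revision of the `CompVis/stable-diffusion-v1-4` checkpoint that the reverted tests load; the prompt is taken from the test above and the output filename is an illustrative assumption.

```python
# Editor's sketch: CPU inference via the ONNX export, mirroring the pipeline
# constructed in test_stable_diffusion_onnx in the hunks above.
from diffusers import OnnxStableDiffusionPipeline

pipe = OnnxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider"
)

prompt = "A painting of a squirrel eating a burger"
image = pipe(prompt).images[0]  # PIL image; no GPU required
image.save("squirrel_burger.png")  # illustrative output path
```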