v1-5 docs updates #921
Conversation
Additionally add FLAX so the model card can be slimmer and point to this page
The documentation is not available anymore as the PR was closed or merged.
I fixed a couple of references to v1-4 (in the text, not the links) :)
@@ -64,44 +64,54 @@ In order to get started, we recommend taking a look at two notebooks:
 - The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffusion models training methods. This notebook takes a step-by-step approach to training your
   diffusion models on an image dataset, with explanatory graphics.

-## **New** Stable Diffusion is now fully compatible with `diffusers`!
+## Stable Diffusion is fully compatible with `diffusers`!
👍
README.md (Outdated)
-Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
+Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
~4 now I think, with attention slicing, but probably better to keep this simple.
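For context, the ~4 GB figure refers to diffusers' attention slicing option. A minimal sketch of how a user enables it (the checkpoint and dtype are illustrative choices here, not part of this diff):

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint and dtype; any Stable Diffusion v1 checkpoint works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Compute attention in slices rather than all at once: a bit slower, but this is
# what brings peak VRAM down to roughly the ~4 GB mentioned above.
pipe.enable_attention_slicing()

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
image.save("astronaut.png")
```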
examples/text_to_image/README.md (Outdated)
@@ -25,7 +25,7 @@ accelerate config

 ### Pokemon example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Suggested change:
-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-5`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
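As an aside for anyone following these example READMEs: once the license on the card has been accepted, the gated weights still need an authenticated download. A minimal sketch of that step follows; the login helper and the `use_auth_token` argument reflect the hub tooling of this period and are assumptions here, not part of the PR.

```python
# Authenticate once, e.g. with `huggingface-cli login` on the command line
# or the login() helper below, so the gated weights can be downloaded.
from huggingface_hub import login
from diffusers import StableDiffusionPipeline

login()  # prompts for a Hugging Face access token

# `use_auth_token=True` reflects the diffusers/hub API of this era and is an assumption here.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", use_auth_token=True
)
```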
examples/textual_inversion/README.md (Outdated)
@@ -29,7 +29,7 @@ accelerate config

 ### Cat toy example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Suggested change:
-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-5`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Looks good to me! Thanks a lot for taking this one @apolinario :-)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
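For readers without the surrounding README context, a minimal end-to-end version of this Flax snippet might look roughly like the following; the checkpoint revision, dtype, and prompt are assumptions for illustration, not part of the diff.

```python
import jax
import jax.numpy as jnp
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

# Assumed checkpoint/revision; the diff above only shows the tail of the snippet.
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16
)

prompt = "a photo of an astronaut riding a horse on mars"
num_samples = jax.device_count()
prompt_ids = pipeline.prepare_inputs([prompt] * num_samples)

# Replicate the params on every device and shard the inputs across devices.
params = replicate(params)
prng_seed = jax.random.split(jax.random.PRNGKey(0), num_samples)
prompt_ids = shard(prompt_ids)

num_inference_steps = 50
images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```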
Unrelated to this PR: I think we should move the reshape functionality into the numpy_to_pil function (maybe we could open an issue for this).
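A hypothetical sketch of what that refactor could look like; the names and placement are illustrative, not the actual diffusers implementation.

```python
import numpy as np
from PIL import Image

def numpy_to_pil(images: np.ndarray):
    # If the array still carries a leading device/shard axis from pmap,
    # collapse everything down to a single (batch, H, W, C) axis first,
    # so callers would no longer need the manual reshape shown above.
    if images.ndim > 4:
        images = images.reshape((-1,) + images.shape[-3:])
    if images.ndim == 3:
        images = images[None, ...]
    images = (images * 255).round().astype("uint8")
    return [Image.fromarray(image) for image in images]
```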
Let's wait on this one until we're sure it's better than v1-4 and the naming is confirmed.
Looks good to me, thanks a lot! Just left a couple of comments.
examples/dreambooth/README.md (Outdated)
@@ -22,7 +22,7 @@ accelerate config

 ### Dog toy example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
I'm not sure if we want to change the model in the examples. I think this is up to the user. The commands have specific hparams tested with a specific checkpoint, and are supposed to work with that checkpoint out of the box, in case some users want to just follow the example.
IMO it would make sense to have the latest model in the examples. What kind of hparams are tuned to v1-4? I remember that when we were preparing SD for release with diffusers we were working on v1-3 and v1-4 dropped quite late; did we swap a lot of stuff?
+1 on Suraj here
Suraj "tuned" the long blog post on v1-4, so I think let's leave it there for now
examples/text_to_image/README.md (Outdated)
@@ -25,7 +25,7 @@ accelerate config

 ### Pokemon example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Same comment as above: let's not change the model in the examples unless we run the experiments with it.
+1
examples/textual_inversion/README.md (Outdated)
@@ -29,7 +29,7 @@ accelerate config

 ### Cat toy example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Same comment as above.
@@ -69,7 +69,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline):
 [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
 safety_checker ([`StableDiffusionSafetyChecker`]):
 Classification module that estimates whether generated images could be considered offensive or harmful.
-Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
+Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
Let's maybe revert this as well, since the community pipeline author should decide here.
@@ -389,7 +389,7 @@ class StableDiffusionLongPromptWeightingPipeline(DiffusionPipeline):
 [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
 safety_checker ([`StableDiffusionSafetyChecker`]):
 Classification module that estimates whether generated images could be considered offensive or harmful.
-Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
+Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
Let's maybe revert this as well, since the community pipeline author should decide here.
@@ -39,7 +39,7 @@ clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K",

 guided_pipeline = DiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    "runwayml/stable-diffusion-v1-5",
@patil-suraj do you want to change this?
we can change this, it should be fine here.
examples/community/README.md (Outdated)
@@ -202,7 +202,7 @@ from diffusers import DiffusionPipeline
 import torch

 pipe = DiffusionPipeline.from_pretrained(
-    'CompVis/stable-diffusion-v1-4',
+    'runwayml/stable-diffusion-v1-5',
Let's maybe revert this as well, since the community pipeline author should decide here.
@@ -139,7 +139,7 @@ def download_image(url):
     response = requests.get(url)
     return PIL.Image.open(BytesIO(response.content)).convert("RGB")

-pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16")
+pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16")
ok for me!
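For readers wondering what the loaded "mega" pipeline is then used for, a short hedged sketch; the `text2img` method name is taken from the community README of this era and should be double-checked against the current version.

```python
import torch
from diffusers import DiffusionPipeline

# Same load call as in the diff above, with the v1-5 checkpoint.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="stable_diffusion_mega",
    torch_dtype=torch.float16,
    revision="fp16",
)
pipe.to("cuda")
pipe.enable_attention_slicing()

# One pipeline object exposing several tasks (text2img, img2img, inpaint).
images = pipe.text2img("An astronaut riding a horse").images
images[0].save("astronaut.png")
```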
examples/community/README.md (Outdated)
@@ -97,7 +97,7 @@ from diffusers import DiffusionPipeline
 import torch

 pipe = DiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    "runwayml/stable-diffusion-v1-5",
Let's maybe revert this as well, since the community pipeline author should decide here.
@@ -58,7 +58,7 @@ feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
 clip_model = CLIPModel.from_pretrained(clip_model_id)

 pipeline = DiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    "runwayml/stable-diffusion-v1-5",
@patil-suraj should we change or leave this?
okay to change this!
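Putting the fragments from this thread and the earlier one together, the full load of the CLIP-guided community pipeline would look roughly like this; the `custom_pipeline` id and the extra keyword arguments mirror the community README and are assumptions here, not part of the diff.

```python
import torch
from transformers import CLIPFeatureExtractor, CLIPModel
from diffusers import DiffusionPipeline

clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id, torch_dtype=torch.float16)

# The clip_model/feature_extractor kwargs are forwarded to the community pipeline.
guided_pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
    torch_dtype=torch.float16,
)
guided_pipeline = guided_pipeline.to("cuda")
```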
@@ -64,7 +64,7 @@ accelerate config

 ### Cat toy example

-You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.
+You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
Let's revert this until we've tested it with v1-5
+1
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
… into v1-5-updates
Thank you so much for reviewing! I think I acted on all the reversals discussed!
Thanks a lot @apolinario !
* Update README.md: Additionally add FLAX so the model card can be slimmer and point to this page
* Find and replace all
* v-1-5 -> v1-5
* revert test changes
* Update README.md (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Update docs/source/quicktour.mdx (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update docs/source/quicktour.mdx (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update README.md (Co-authored-by: Suraj Patil <surajp815@gmail.com>)
* Revert certain references to v1-5
* Docs changes
* Apply suggestions from code review

Co-authored-by: apolinario <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: anton-l <anton@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>