[BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_vae_length #10306

Merged
merged 3 commits into from
Dec 20, 2024

@syntaxticsugr syntaxticsugr commented Dec 19, 2024

What does this PR do?

Passing a torch.Tensor as the initial_audio_waveforms parameter, as described here, raises TypeError: new_zeros(): argument 'size' failed to unpack the object at pos 3 with error "type must be tuple of ints,but got float"

Notebook to Reproduce the Error

In function prepare_latents:

audio_vae_length = self.transformer.config.sample_size * self.vae.hop_length

audio_vae_length evaluates to a float because self.transformer.config.sample_size returns a float

audio = initial_audio_waveforms.new_zeros(audio_shape)

torch.Tensor.new_zeros() accepts a single argument, size: a list, tuple, or torch.Size of integers defining the shape of the output tensor. (Source)

audio_shape = (batch_size // num_waveforms_per_prompt, audio_channels, audio_vae_length)

But the elements of audio_shape have types (int, int, float), because audio_vae_length is a float.
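
The type problem can be reproduced without PyTorch. The following is a minimal sketch using only the Python standard library. The numeric values are illustrative assumptions rather than the real config values, and operator.index() is only a stand-in for a strict "must be an integer" check; it is not claimed to be what new_zeros() calls internally.

```python
import operator

# Illustrative values only; the real ones come from the model config.
sample_size = 1024.0  # a float, as the PR reports for transformer.config.sample_size
hop_length = 2048     # assumed here to be an int

# int * float promotes to float, so the length is no longer a valid dimension.
audio_vae_length = sample_size * hop_length
audio_shape = (1, 2, audio_vae_length)

shape_types = tuple(type(dim).__name__ for dim in audio_shape)
print(shape_types)  # ('int', 'int', 'float')

# operator.index() accepts only true integers, so it rejects a float
# even when the float has no fractional part.
try:
    operator.index(audio_vae_length)
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

Note that the float is rejected even though its value is a whole number: the check is on the type, not the value.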

The proposed fix is to wrap self.transformer.config.sample_size in int(), so that audio_vae_length is an int.
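
A rough sketch of the effect of that int() wrap, written as a standalone function so it runs without diffusers or PyTorch. The function name build_audio_shape and the argument values are hypothetical, chosen only for illustration; the actual change lives inside prepare_latents.

```python
import operator


def build_audio_shape(sample_size, hop_length, batch_size,
                      num_waveforms_per_prompt, audio_channels):
    """Hypothetical helper mirroring the shape computation described in the PR."""
    # The fix: cast sample_size to int before multiplying, so the product stays an int.
    audio_vae_length = int(sample_size) * hop_length
    return (batch_size // num_waveforms_per_prompt, audio_channels, audio_vae_length)


# Illustrative values; sample_size is deliberately a float.
shape = build_audio_shape(
    sample_size=1024.0,
    hop_length=2048,
    batch_size=4,
    num_waveforms_per_prompt=2,
    audio_channels=2,
)

# Every dimension now passes a strict integer check (operator.index raises on floats).
checked = tuple(operator.index(dim) for dim in shape)
print(checked)  # (2, 2, 2097152)
```

Because int() truncates toward zero, this assumes sample_size holds a whole number stored as a float; a genuinely fractional value would be silently rounded down.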

Notebook with the Applied Fix


Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
    Do not work on the main branch. 🤦‍♂️
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

…ize' failed to unpack the object at pos 3 with error "type must be tuple of ints,but got float"

torch.Tensor.new_zeros() takes a single argument size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.

in function prepare_latents:
audio_vae_length = self.transformer.config.sample_size * self.vae.hop_length
audio_shape = (batch_size // num_waveforms_per_prompt, audio_channels, audio_vae_length)
...
audio = initial_audio_waveforms.new_zeros(audio_shape)

audio_vae_length evaluates to float because self.transformer.config.sample_size returns a float

@syntaxticsugr changed the title from "[BUG FIX] [Stable Audio Pipeline] initial_audio_waveforms raises TypeError: new_zeros()" to "[BUG FIX] [Stable Audio Pipeline] Resolve TypeError in function prepare_latents caused by new_zeros()" on Dec 19, 2024
@syntaxticsugr changed the title from "[BUG FIX] [Stable Audio Pipeline] Resolve TypeError in function prepare_latents caused by new_zeros()" to "[BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_shape" on Dec 19, 2024
@syntaxticsugr changed the title from "[BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_shape" to "[BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_vae_length" on Dec 19, 2024

@hlky hlky left a comment

Thanks @syntaxticsugr!

@hlky hlky merged commit 9020086 into huggingface:main Dec 20, 2024
12 checks passed
danhipke pushed a commit to danhipke/diffusers that referenced this pull request Dec 20, 2024
…peError in function prepare_latents caused by audio_vae_length (huggingface#10306)

Co-authored-by: hlky <hlky@hlky.ac>
Foundsheep pushed a commit to Foundsheep/diffusers that referenced this pull request Dec 23, 2024
…peError in function prepare_latents caused by audio_vae_length (huggingface#10306)

Co-authored-by: hlky <hlky@hlky.ac>
sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
…peError in function prepare_latents caused by audio_vae_length (#10306)

Co-authored-by: hlky <hlky@hlky.ac>
DN6 added a commit that referenced this pull request Jan 10, 2025
…ve load performance on network mounts (#10305)

* Add no_mmap arg.

* Fix arg parsing.

* Update another method to force no mmap.

* logging

* logging2

* propagate no_mmap

* logging3

* propagate no_mmap

* logging4

* fix open call

* clean up logging

* cleanup

* fix missing arg

* update logging and comments

* Rename to disable_mmap and update other references.

* [Docs] Update ltx_video.md to remove generator from `from_pretrained()` (#10316)

Update ltx_video.md to remove generator from `from_pretrained()`

* docs: fix a mistake in docstring (#10319)

Update pipeline_hunyuan_video.py

docs: fix a mistake

* [BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_vae_length (#10306)

Co-authored-by: hlky <hlky@hlky.ac>

* [docs] Fix quantization links (#10323)

Update overview.md

* [Sana]add 2K related model for Sana (#10322)

add 2K related model for Sana

* Update src/diffusers/loaders/single_file_model.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* make style

---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Leojc <liao_junchao@outlook.com>
Co-authored-by: Aditya Raj <syntaxticsugr@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>