
Make EncodecModel.decode ONNX exportable #29913

Merged Apr 3, 2024 (4 commits)

Conversation

@fxmarty (Contributor) commented Mar 27, 2024

As per title. This is needed for the ONNX export of Musicgen, e.g. for transformers.js.

This removes an important warning:

/home/fxmarty/hf_internship/transformers/src/transformers/models/encodec/modeling_encodec.py:121: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ideal_length = (math.ceil(n_frames) - 1) * stride + (kernel_size - padding_total)

where a padding length was previously hard-coded and applied here:

extra_padding = self._get_extra_padding_for_conv1d(hidden_states, kernel_size, stride, padding_total)
if self.causal:
    # Left padding for causal
    hidden_states = self._pad1d(hidden_states, (padding_total, extra_padding), mode=self.pad_mode)
else:
    # Asymmetric padding required for odd strides
    padding_right = padding_total // 2
    padding_left = padding_total - padding_right
    hidden_states = self._pad1d(
        hidden_states, (padding_left, padding_right + extra_padding), mode=self.pad_mode
    )
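The branch above can be illustrated in isolation. This is a minimal sketch with hypothetical names and values (pad1d_sketch is a simplified stand-in for the module's _pad1d helper, not the actual Encodec code): causal convolutions put all of padding_total on the left, while the non-causal path splits it as evenly as possible, with extra_padding always landing on the right.

```python
import torch
import torch.nn.functional as F

def pad1d_sketch(x, paddings, mode="zeros"):
    # Simplified stand-in for EncodecConv1d._pad1d: pads the last dimension.
    mode = "constant" if mode == "zeros" else mode
    return F.pad(x, paddings, mode=mode)

x = torch.ones(1, 1, 10)
padding_total, extra_padding = 4, 1

# Causal: all of padding_total on the left, extra padding on the right.
causal = pad1d_sketch(x, (padding_total, extra_padding))

# Non-causal: split padding_total as evenly as possible across both sides.
padding_right = padding_total // 2
padding_left = padding_total - padding_right
non_causal = pad1d_sketch(x, (padding_left, padding_right + extra_padding))
```

Both paths add the same total amount of padding (padding_total + extra_padding); only its placement differs.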

Comment on lines 128 to 129
n_frames = (length - self.kernel_size + self.padding_total) / self.stride + 1
ideal_length = ((torch.ceil(n_frames).to(torch.int64) - 1) * self.stride + (self.kernel_size - self.padding_total))
@fxmarty (Contributor, Author) commented Mar 27, 2024


Essentially, we need these ops to be on tensors and not on Python types (hence the registration of buffers).

The .to(torch.int64) is added because the produced ONNX model is wrong otherwise (it tries to concat the float ideal_length - length with padding_total, which is illegal).
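A standalone sketch of the reworked computation, with hypothetical buffer values (the real module registers kernel_size, stride, and padding_total as buffers on the conv layer): because torch.ceil operates on a tensor, the tracer records the op instead of baking a Python float in as a constant, and the .to(torch.int64) cast keeps the padding length integral.

```python
import torch

# Hypothetical values standing in for the registered int64 buffers.
kernel_size = torch.tensor(7, dtype=torch.int64)
stride = torch.tensor(2, dtype=torch.int64)
padding_total = torch.tensor(4, dtype=torch.int64)

hidden_states = torch.zeros(1, 1, 100)
length = torch.tensor(hidden_states.shape[-1])

# Tensor ops throughout, so tracing records the data flow:
n_frames = (length - kernel_size + padding_total) / stride + 1
ideal_length = (torch.ceil(n_frames).to(torch.int64) - 1) * stride + (kernel_size - padding_total)
extra_padding = ideal_length - length
```

With math.ceil on a Python float, the same value would be frozen into the trace for one input length and silently reused for every other input, which is exactly the TracerWarning quoted above.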


@xenova (Contributor) left a comment


Can confirm this fixes the issue with the ONNX export (tested with transformers.js)!

@ylacombe (Contributor) left a comment


LGTM!

@fxmarty fxmarty merged commit 81642d2 into huggingface:main Apr 3, 2024
18 checks passed
@ArthurZucker (Collaborator) left a comment


LGTM
