Add Moonshine #34784
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from bac9c9f to d0ed917
Force-pushed from d4864c7 to 22dbaae
(just some notes in the meantime).
Review thread on src/transformers/models/moonshine/convert_usefulsensors_to_hf.py (outdated, resolved)
Co-authored-by: Joshua Lochner <admin@xenova.com>
Amazing work @eustlb and team! 🤗
* config draft
* full encoder forward
* full decoder forward
* fix sdpa and FA2
* fix sdpa and FA2
* moonshine model
* moonshine model forward
* fix attention with past_key_values
* add MoonshineForConditionalGeneration
* fix cache handling and causality for cross attention
* no causal attention mask for the encoder
* model addition (imports etc)
* small nit
* nits
* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* add rope_theta
* nits
* model doc
* Update src/transformers/models/auto/configuration_auto.py (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* imports
* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES
* updates modular
* make
* make fix-copies
* ruff check examples fix
* fix check_modular_conversion
* nit
* nits
* nits
* copied from -> imports
* imports fix
* integrate attention refacto
* modular edge case
* remove encoder
* convolutions params in config
* run modular_model_converter
* make
* Update docs/source/en/model_doc/moonshine.md (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* MoonshineModelTest
* correct typo
* make style
* integration tests
* make
* modular convert
* name conversion update (up_proj -> fc1 etc)
* update config
* update MLP
* update attention
* update encoder layer
* update decoder layer
* update convolutions parameters
* update encoder
* remove INPUTS_DOCSTRING
* update decoder
* update conditional generation
* update pretrained model
* imports
* modular converted
* update doc
* fix
* typo
* update doc
* update license
* update init
* split config in file
* two classes for MLP
* attention from GLM
* from GlmRotaryEmbedding
* split MLP
* apply arthur's review suggestions
* apply arthur's review suggestions
* apply arthur's review suggestions
* auto feature extractor
* convert modular
* fix + make
* convert modular
* make
* unsplit config
* use correct checkpoint
* wrap generate
* update tests
* typos
* make
* typo
* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Please check usefulsensors/moonshine#81; there is an error when using Hugging Face to load Moonshine.
What does this PR do?
This PR adds support for Moonshine to the Transformers library.
Moonshine builds on top of Whisper’s architecture to overcome some of its limitations, primarily the restriction to a fixed 30-second audio window.
Key improvements in Moonshine’s architecture:
1. It uses SwiGLU activation instead of GELU in the decoder layers.
2. Most importantly, it replaces absolute position embeddings with Rotary Position Embeddings (RoPE), enabling Moonshine to process audio inputs of any length—unlike Whisper, which is limited to fixed 30-second windows.
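As a rough illustration of point 1, here is a minimal PyTorch sketch of the SwiGLU gating idea (a SiLU-gated MLP). It is not copied from this PR; the class and attribute names are placeholders:

```python
import torch
import torch.nn as nn

class SwiGLUMLP(nn.Module):
    """Illustrative SwiGLU block: fused gate/up projection, SiLU gating, then a down projection."""
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, intermediate_size * 2)  # gate and up projections fused
        self.fc2 = nn.Linear(intermediate_size, hidden_size)      # down projection
        self.act = nn.SiLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        gate, up = self.fc1(hidden_states).chunk(2, dim=-1)
        return self.fc2(self.act(gate) * up)
```

And a minimal usage sketch of the model class this PR adds. The checkpoint id and the dummy dataset are assumptions for illustration, not taken from the PR itself:

```python
import torch
from datasets import load_dataset
from transformers import AutoProcessor, MoonshineForConditionalGeneration

# Assumed checkpoint id; substitute whichever Moonshine checkpoint you actually use.
checkpoint = "UsefulSensors/moonshine-tiny"
processor = AutoProcessor.from_pretrained(checkpoint)
model = MoonshineForConditionalGeneration.from_pretrained(checkpoint)

# Any 16 kHz mono waveform works; a small dummy ASR dataset is used here for illustration.
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
inputs = processor(ds[0]["audio"]["array"], sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```

Because of the RoPE-based positions mentioned in point 2, the input is not padded or truncated to a fixed 30-second window before being passed to the model.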
Who can review?
@ArthurZucker
TODO