Break up datasets.py #141

Merged
merged 6 commits into from
Oct 28, 2024

@juberti commented Oct 19, 2024

This splits out types.py and registry.py to move the list of pre-defined datasets to its own file and avoid circular refs.

An __all__ import is used to minimize changes to surrounding code.
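
A minimal sketch of the pattern described above, not the actual ultravox code. The package name `demo_data`, the `DatasetConfig` fields, `DATASET_MAP`, and `register` are all hypothetical names chosen for illustration. It shows how `types.py` and `registry.py` can each declare `__all__`, so that `datasets.py` re-exports their names with a star import and existing callers can keep importing from `datasets`.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package layout; module contents are illustrative only.
FILES = {
    "demo_data/__init__.py": "",
    "demo_data/types.py": '''
        import dataclasses

        __all__ = ["DatasetConfig"]

        @dataclasses.dataclass
        class DatasetConfig:
            name: str
            path: str
    ''',
    "demo_data/registry.py": '''
        # The registry depends only on types, never on datasets,
        # so there is no import cycle.
        from demo_data import types

        __all__ = ["DATASET_MAP", "register"]

        DATASET_MAP = {}

        def register(config: types.DatasetConfig) -> None:
            DATASET_MAP[config.name] = config

        register(types.DatasetConfig(name="example", path="hypothetical/example"))
    ''',
    "demo_data/datasets.py": '''
        # Star imports pull in exactly the names listed in each __all__,
        # so code that imported them from datasets keeps working.
        from demo_data.types import *
        from demo_data.registry import *
    ''',
}


def build_and_import(root: Path):
    """Write the demo package under root and import demo_data.datasets."""
    for rel, src in FILES.items():
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(textwrap.dedent(src))
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("demo_data.datasets")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    datasets = build_and_import(Path(tmp))
    # Public names visible on datasets after the star imports.
    exported = sorted(n for n in vars(datasets) if not n.startswith("_"))

print(exported)
```

Because `registry.py` does not list its `types` import in `__all__`, that helper name does not leak into `datasets`; only the names each module chooses to publish are re-exported.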

@farzadab left a comment

LGTM

@juberti enabled auto-merge (squash) October 28, 2024 06:19
@juberti merged commit 487e939 into main Oct 28, 2024
1 check passed
akshat0311 pushed a commit to jiviai/audio-llm that referenced this pull request Jan 30, 2025
* Break up datasets.py

This splits out types.py and registry.py to move the list of pre-defined datasets to its own file and avoid circular refs.

An __all__ import is used to minimize changes to surrounding code.

* sr

* cr

* merge

* restore typing
zqhuang211 pushed a commit that referenced this pull request Feb 12, 2025
* Fix typo in README.md (#128)
* [bugfix] Missing enable_fsdp in 70b config (#132)
* Update load warnings (#126)
* Generic datasets with inheritance (#135)
* Switch InterleaveDataset to use weights (e.g., 2.0, 0.5, etc) (#140)
* Break up datasets.py (#141)
* Update registry with more languages commonvoice (#143)
* Split dataset definitions into individual files (#145)
* Add whisper masking (#146)
* Defining block size in UltravoxConfig, and solving assertions (#157)