🎼 DiffMusic: A Zero-shot Diffusion-Based Framework for Music Inverse Problem

This repository contains the implementation for the final project of the CommE5070 Deep Learning for Music Analysis and Generation course (Fall 2024) at National Taiwan University. For a detailed report, please refer to these slides.

Setup

To set up the virtual environment and install the required packages, use the following commands:

virtualenv --python=python3.10 diffmusic
source diffmusic/bin/activate
pip install -r requirements.txt

Download CLAP Pretrained Weights

mkdir CLAP_weights
cd CLAP_weights
wget https://huggingface.co/microsoft/msclap/resolve/main/CLAP_weights_2022.pth
wget https://huggingface.co/microsoft/msclap/resolve/main/CLAP_weights_2023.pth
cd ..
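
Where `wget` is not available, the same two files can be fetched from Python. The following is a minimal sketch using only the standard library; `weight_urls` and `download_clap_weights` are hypothetical helper names and are not part of this repository.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and filenames copied from the wget commands above.
BASE_URL = "https://huggingface.co/microsoft/msclap/resolve/main"
WEIGHT_FILES = ("CLAP_weights_2022.pth", "CLAP_weights_2023.pth")


def weight_urls():
    """Map each checkpoint filename to its download URL."""
    return {name: f"{BASE_URL}/{name}" for name in WEIGHT_FILES}


def download_clap_weights(dest="CLAP_weights"):
    """Download any checkpoint not already present in dest; return the paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, url in weight_urls().items():
        target = dest_dir / name
        if not target.exists():  # skip files left by an earlier run
            urlretrieve(url, target)
        paths.append(target)
    return paths
```

Calling `download_clap_weights()` from the repository root should leave both files in `CLAP_weights/`. The existence check does not detect an interrupted download, so delete any partial file before retrying.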

Data Preparation

To download the dataset, run the following script:

bash scripts/download_data.sh

Generating Music for Inverse Problems

To address an inverse problem, you can use the following command:

python run.py \
    --task <Inverse Problem Task: {music_generation, music_inpainting, phase_retrieval, super_resolution, dereverberation, style_guidance}> \
    --scheduler <Sampling Scheduler: {ddim, dps, mpgd, dsg, diffmusic}> \
    --config_path <Path to Model Configuration> \
    --prompt ""

Available Inverse Problem Tasks

The following tasks can be specified with the --task option:

  • music_generation
  • music_inpainting
  • phase_retrieval
  • super_resolution
  • dereverberation
  • style_guidance
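
For orientation, the measurement-based tasks in this list are commonly written as recovering a clean signal $x$ from a degraded observation $y$. The operators below are the textbook forms and are an assumption here; the repository's actual operators may differ in detail (for example, in which time-frequency representation they act on).

```latex
\begin{align*}
  y &= \mathcal{A}(x) + n
    && \text{general inverse problem, with noise } n \\
  y &= m \odot x
    && \text{inpainting: binary mask } m \text{ zeroes the missing segment} \\
  y &= \lvert \operatorname{STFT}(x) \rvert
    && \text{phase retrieval: only magnitudes are observed} \\
  y &= D_{r}(x)
    && \text{super-resolution: } D_{r} \text{ reduces the sampling rate by a factor } r \\
  y &= h * x
    && \text{dereverberation: convolution with a room impulse response } h
\end{align*}
```

music_generation and style_guidance are left out of this sketch because this README does not say how they are formulated.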

Available Schedulers

The following schedulers can be specified with the --scheduler option:

  • ddim
  • dps
  • mpgd
  • dsg
  • diffmusic

Available Model Configurations

Specify the model configuration file with the --config_path option:

  • configs/audioldm2.yaml
  • configs/musicldm.yaml
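
When scripting several runs, the option values listed above can be checked before `run.py` is launched. The following is a minimal sketch; `build_command` is a hypothetical helper that is not part of this repository, and it only assembles the argument list shown in the command template.

```python
import shlex

# Option values copied from the lists in this README.
TASKS = (
    "music_generation", "music_inpainting", "phase_retrieval",
    "super_resolution", "dereverberation", "style_guidance",
)
SCHEDULERS = ("ddim", "dps", "mpgd", "dsg", "diffmusic")


def build_command(task, scheduler, config_path, prompt=""):
    """Return the argv list for one run.py call (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unknown scheduler: {scheduler!r}")
    return [
        "python", "run.py",
        "--task", task,
        "--scheduler", scheduler,
        "--config_path", config_path,
        "--prompt", prompt,
    ]


if __name__ == "__main__":
    cmd = build_command("music_inpainting", "diffmusic", "configs/musicldm.yaml")
    # Print a shell-quoted form that can be pasted into a terminal.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Using a list rather than a single string avoids shell-quoting problems when the prompt contains spaces.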

Example Command

To perform music inpainting with a specific configuration:

python run.py \
    --task "music_inpainting" \
    --config_path "configs/musicldm.yaml" \
    --prompt ""

To perform style guidance with a specific configuration:

python run.py \
    --task "style_guidance" \
    --config_path "configs/audioldm2.yaml" \
    --prompt "A female reporter is singing"

Environment

We developed and ran the code on Ubuntu 22.04.1 with a 12th Generation Intel(R) Core(TM) i7-12700 CPU and a single NVIDIA GeForce RTX 4090 GPU with 24 GB of memory.

Citation

If you use this code, please cite the following:

@misc{liao2024_diffmusic,
    title  = {DiffMusic: A Unified Diffusion-Based Framework for Music Inverse Problem},
    author = {Jia-Wei Liao and Pin-Chi Pan and Sheng-Ping Yang},
    url    = {https://github.com/jwliao1209/DiffMusic},
    year   = {2024}
}
