This repository contains the implementation of the final project for the CommE5070 Deep Learning for Music Analysis and Generation course, Fall 2024, at National Taiwan University. For a detailed report, please refer to these slides.
To set up the virtual environment, install the required packages, and download the CLAP weights, use the following commands:
virtualenv --python=python3.10 diffmusic
source diffmusic/bin/activate
pip install -r requirements.txt
mkdir CLAP_weights
cd CLAP_weights
wget https://huggingface.co/microsoft/msclap/resolve/main/CLAP_weights_2022.pth
wget https://huggingface.co/microsoft/msclap/resolve/main/CLAP_weights_2023.pth
cd ..
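Interrupted downloads can leave missing or empty weight files, so a quick check before running anything may save time. The following is a minimal sketch, not part of the repository: the helper name `missing_clap_weights` is hypothetical, and the only facts it relies on are the `CLAP_weights` directory and the two file names from the `wget` commands above.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_WEIGHTS = ("CLAP_weights_2022.pth", "CLAP_weights_2023.pth")


def missing_clap_weights(weights_dir="CLAP_weights"):
    """Return the expected weight files that are absent or empty in weights_dir.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    root = Path(weights_dir)
    missing = []
    for name in EXPECTED_WEIGHTS:
        path = root / name
        # A zero-byte file usually means the download was interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_clap_weights()
    if missing:
        print("Missing or empty CLAP weights:", ", ".join(missing))
    else:
        print("All CLAP weights found.")
```

Run it from the repository root so that the relative `CLAP_weights` path resolves correctly.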
To download the dataset, run the following script:
bash scripts/download_data.sh
To solve an inverse problem, run the following command:
python run.py \
--task <Inverse Problem Task: {music_generation, music_inpainting, phase_retrieval, super_resolution, dereverberation, style_guidance}> \
--scheduler <Sampling Scheduler: {ddim, dps, mpgd, dsg, diffmusic}> \
--config_path <Path to Model Configuration> \
--prompt ""
The following tasks can be specified with the --task option:
music_generation
music_inpainting
phase_retrieval
super_resolution
dereverberation
style_guidance
The following sampling schedulers can be specified with the --scheduler option:
ddim
dps
mpgd
dsg
diffmusic
Specify the model configuration file with the --config_path option:
configs/audioldm2.yaml
configs/musicldm.yaml
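When launching many runs, it can help to validate the option values before calling `run.py`. The sketch below is an illustration under stated assumptions rather than part of the repository: `build_run_command` is a hypothetical helper, the allowed values are copied from the lists above, and the `--scheduler` flag is left out when no scheduler is given, on the assumption that `run.py` then falls back to its own default.

```python
import shlex

# Allowed values copied from the option lists above.
TASKS = (
    "music_generation",
    "music_inpainting",
    "phase_retrieval",
    "super_resolution",
    "dereverberation",
    "style_guidance",
)
SCHEDULERS = ("ddim", "dps", "mpgd", "dsg", "diffmusic")


def build_run_command(task, config_path, scheduler=None, prompt=""):
    """Build the argument list for run.py after validating task and scheduler.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    if scheduler is not None and scheduler not in SCHEDULERS:
        raise ValueError(
            f"unknown scheduler {scheduler!r}; expected one of {SCHEDULERS}"
        )
    cmd = ["python", "run.py", "--task", task]
    if scheduler is not None:
        # Assumption: omitting --scheduler lets run.py use its default.
        cmd += ["--scheduler", scheduler]
    cmd += ["--config_path", config_path, "--prompt", prompt]
    return cmd


if __name__ == "__main__":
    cmd = build_run_command(
        "music_inpainting", "configs/musicldm.yaml", scheduler="diffmusic"
    )
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with prompts that contain spaces.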
To perform music inpainting with a specific configuration:
python run.py \
--task "music_inpainting" \
--config_path "configs/musicldm.yaml" \
--prompt ""
To perform style guidance with a specific configuration:
python run.py \
--task "style_guidance" \
--config_path "configs/audioldm2.yaml" \
--prompt "A female reporter is singing"
We developed and ran the code on a machine running Ubuntu 22.04.1 with a 12th Gen Intel(R) Core(TM) i7-12700 CPU and a single NVIDIA GeForce RTX 4090 GPU with 24 GB of memory.
If you use this code, please cite the following:
@misc{liao2024_diffmusic,
title = {DiffMusic: A Unified Diffusion-Based Framework for Music Inverse Problem},
author = {Jia-Wei Liao and Pin-Chi Pan and Sheng-Ping Yang},
url = {https://github.com/jwliao1209/DiffMusic},
year = {2024}
}