DDSP Articulatory Vocoder

Official implementation of the paper:

Fast, High-Quality and Parameter-Efficient Articulatory Synthesis Using Differentiable DSP (SLT 2024)

[paper][demo]

The DDSP code is based on https://github.com/sweetcocoa/ddsp-pytorch and https://intro2ddsp.github.io/intro.html

Environment Setup

  • Install conda
  • Run conda env create -f environment.yml to create the conda environment

Dataset prep

  1. Download paired EMA and speech data, such as HPRC.
  2. Resample the audio to 16 kHz and the EMA data to 200 Hz.
  3. Use data_prep/batch_invert.ipynb or other methods to extract pitch and loudness from the waveforms at 200 Hz.
  4. Use data_split.ipynb to generate JSONs that define the train/val/test splits.
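The resampling in step 2 can be sketched as below. This is not the repository's preprocessing code; the file formats, channel counts, and source rates are hypothetical placeholders, and HPRC's actual layout may differ.

```python
# Hedged sketch of step 2: resample audio to 16 kHz and EMA to 200 Hz.
# Input shapes and source rates below are illustrative assumptions.
import numpy as np
from scipy.signal import resample_poly

def resample_audio(wav, orig_sr, target_sr=16000):
    """Polyphase resampling of a mono waveform to the target rate."""
    g = np.gcd(orig_sr, target_sr)
    return resample_poly(wav, target_sr // g, orig_sr // g)

def resample_ema(ema, orig_hz, target_hz=200):
    """Linearly interpolate EMA channels (time x channels) onto a 200 Hz grid."""
    n_in = ema.shape[0]
    t_in = np.arange(n_in) / orig_hz
    t_out = np.arange(int(n_in * target_hz / orig_hz)) / target_hz
    return np.stack(
        [np.interp(t_out, t_in, ema[:, c]) for c in range(ema.shape[1])], axis=1
    )

# Illustrative 1-second inputs: 44.1 kHz audio, 12-channel 500 Hz EMA.
wav = np.random.randn(44100)
ema = np.random.randn(500, 12)
print(resample_audio(wav, 44100).shape)  # (16000,)
print(resample_ema(ema, 500).shape)      # (200, 12)
```

After this step, each utterance yields a 16 kHz waveform and a 200 Hz EMA track, so 80 audio samples correspond to one EMA frame.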

Training

  1. Edit the config YAML, which defines the hyperparameters, training parameters, and file directories.
  2. Run python vocoder/main.py --config yamls/config.yaml from the source directory to train.
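As a purely illustrative sketch of step 1, a config might group settings along these lines; the key names below are hypothetical, and the actual fields are those in yamls/config.yaml:

```
# Hypothetical config sketch -- consult yamls/config.yaml for the real keys.
data:
  train_json: data/splits/train.json   # placeholder paths
  val_json: data/splits/val.json
  sample_rate: 16000                   # from dataset prep
  ema_rate: 200
training:
  batch_size: 16                       # illustrative values only
  learning_rate: 1.0e-4
  num_epochs: 100
  checkpoint_dir: checkpoints/
```

Whatever the actual schema, the file directories must point at the outputs of the dataset prep steps above.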

Citation

If you find this repository useful, please cite our work with the following BibTeX entry:

@misc{louis24ddsp,
    title={Fast, High-Quality and Parameter-Efficient Articulatory Synthesis using Differentiable DSP},
    author={Yisi Liu and Bohan Yu and Drake Lin and Peter Wu and Cheol Jun Cho and Gopala Krishna Anumanchipalli},
    year={2024},
    eprint={2409.02451},
    archivePrefix={arXiv},
    primaryClass={eess.AS}
}
