Bach Violin Dataset

The Bach Violin Dataset is a collection of high-quality public recordings of Bach's sonatas and partitas for solo violin (BWV 1001–1006). The dataset consists of 6.5 hours of professional recordings from 17 violinists recorded in various recording setups. It also provides the reference scores and estimated alignments between the recordings and scores. The dataset can be downloaded from Zenodo or GitHub.

All recordings are collected from the web. Their source URLs can be found in audio.csv. They come in either MP3 or OPUS format, depending on their sources.
Each folder in the audio directory corresponds to a collection that contains recordings with similar recording setups except the misc folder. (Due to copyright concern, audio files in the collections shunske-sato and young-talents need to be downloaded from YouTube.)
The corresponding reference score for each recording can be found in info.csv. They come in MusicXML format, which can be opened with MuseScore, music21 and MusPy.

Below is the file organization of the dataset.

├─ README                                  README file
├─ audio.csv                               Metadata of the audio files
├─ info.csv                                Metadata of the processed files
├─ audio
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001.mp3         Recording
│  │  └─ ...
│  └─ ...
├─ scores
│  ├─ bwv1001
│  │  ├─ bwv1001.mxl                       Reference score (whole piece)
│  │  ├─ bwv1001_mov1.mxl                  Reference score (single movement)
│  │  └─ ...
│  └─ ...
├─ notes
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001_mov1.csv    Score as a note sequence
│  │  └─ ...
│  └─ ...
├─ alignments
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001_mov1.csv    Estimated alignment
│  │  └─ ...
│  └─ ...
└─ tempos
   ├─ emil-telmanyi
   │  ├─ emil-telmanyi_bwv1001_mov1.txt    Average tempo
   │  └─ ...
   └─ ...

Source code

The source code for creating this dataset can be found in the src directory. More details can be found in its README. An introduction of the alignment process is available here.

License

All audio files in this dataset are public recordings collected from various sources. The license for each audio file can be found in its parent directory. All derived alignments retain the same licenses as their corresponding audio files. All reference scores are in public domain. All the code is licensed under MIT.

Citation

Please cite the following paper if you use the dataset or code provided.

Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley, "Deep Performer: Score-to-Audio Music Performance Synthesis," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

Paper

Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
[homepage] [paper] [reviews]

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
.github		.github
bach-violin		bach-violin
docs		docs
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Bach Violin Dataset

Contents

Source code

License

Citation

Paper

About

Releases 1

Sponsor this project

Languages

salu133445/bach-violin-dataset

Folders and files

Latest commit

History

Repository files navigation

Bach Violin Dataset

Contents

Source code

License

Citation

Paper

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Sponsor this project

Languages