HSD: A hierarchical singing annotation dataset

This repository provides a singing annotation dataset that records vocal information in pop songs. It labels the pitch, duration, lyric, onset, and offset of each musical note, and records all of this information in a hierarchical structure.

Annotations

Two kinds of annotations are offered: enhanced LRC and MIDI. The enhanced LRC annotations are recommended because they record the singing information in a hierarchical structure.

enhanced LRC

The enhanced LRC files are in the "enhanced_lrc" folder. Each line in an enhanced LRC file records the vocal information of one musical phrase, in the following format:

[phrase time tag]<onset time tag>lyric pitch duration{offset time tag}<onset time tag>lyric pitch duration{offset time tag}...<onset time tag>lyric pitch duration{offset time tag}
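
As an illustration of the format above, here is a minimal parsing sketch. It is not the repository's own reader. It assumes that the three fields inside each note are separated by whitespace and it keeps every value as an uninterpreted string, because the exact encoding of time tags, pitch, and duration is not specified here. The function name `parse_phrase_line` and the sample line are hypothetical.

```python
import re

# [phrase time tag] at the start of the line
PHRASE_RE = re.compile(r"\[([^\]]*)\]")
# One note: <onset time tag>lyric pitch duration{offset time tag}
NOTE_RE = re.compile(r"<([^>]*)>([^{]*)\{([^}]*)\}")


def parse_phrase_line(line):
    """Parse one enhanced LRC line into a phrase dict holding a list of notes.

    Assumption: lyric, pitch and duration are whitespace-separated.
    All values are returned as strings, without interpreting them.
    """
    line = line.strip()
    phrase = PHRASE_RE.match(line)
    if phrase is None:
        raise ValueError("line does not start with a [phrase time tag]")

    notes = []
    for onset, body, offset in NOTE_RE.findall(line[phrase.end():]):
        # Split from the right so that a lyric containing spaces stays intact.
        parts = body.strip().rsplit(None, 2)
        if len(parts) != 3:
            raise ValueError("expected 'lyric pitch duration', got %r" % body)
        lyric, pitch, duration = parts
        notes.append({
            "onset": onset,
            "lyric": lyric,
            "pitch": pitch,
            "duration": duration,
            "offset": offset,
        })
    return {"phrase_time": phrase.group(1), "notes": notes}


if __name__ == "__main__":
    # Made-up sample line; real files may encode the values differently.
    sample = "[00:12.50]<00:12.50>la 60 0.5{00:13.00}<00:13.00>li 62 0.5{00:13.50}"
    print(parse_phrase_line(sample))
```

The result mirrors the hierarchy of the annotation: one phrase per line, each containing its notes in order.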

"read_enhanced_lyric.py" can be used to read the annotations.

MIDI

The annotations are also provided as MIDI files in the "midi" folder.

Label Initialization

The MIDI labels are initialized from musical notation and LRC files. The corrected musical notation and LRC files are in the "notation" and "lrc" folders, respectively.

"initialize_label.py" can be used to create coarse labels.

Manual Label Calibration

All the labels are calibrated through a manual process. Annotators correct the time tags in the LRC files to calibrate the whole song.

We also provide a method that converts the enhanced LRC files directly to MIDI files. "elrc2midi.py" can be used to run this conversion.
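
To give a rough idea of what such a conversion involves, the sketch below flattens already-parsed notes into timed note events, which is the kind of intermediate list a MIDI writer would consume. It is not the logic of "elrc2midi.py". It assumes, without confirmation from this repository, that time tags follow the common LRC `mm:ss.xx` form and that pitch is stored as an integer MIDI note number. The names `tag_to_seconds` and `notes_to_events` are hypothetical.

```python
def tag_to_seconds(tag):
    """Convert an 'mm:ss.xx' time tag to seconds (assumed tag format)."""
    minutes, seconds = tag.split(":")
    return int(minutes) * 60 + float(seconds)


def notes_to_events(notes):
    """Turn note dicts into sorted (start_sec, end_sec, pitch) tuples.

    Assumption: each note has 'onset' and 'offset' time tags and a
    'pitch' field holding an integer MIDI note number as text.
    """
    events = []
    for note in notes:
        start = tag_to_seconds(note["onset"])
        end = tag_to_seconds(note["offset"])
        if end <= start:
            raise ValueError("offset must come after onset: %r" % (note,))
        events.append((start, end, int(note["pitch"])))
    return sorted(events)


if __name__ == "__main__":
    # Made-up notes for illustration only.
    notes = [
        {"onset": "00:13.00", "offset": "00:13.50", "pitch": "62"},
        {"onset": "00:12.50", "offset": "00:13.00", "pitch": "60"},
    ]
    for event in notes_to_events(notes):
        print(event)
```

Writing the events to an actual MIDI file would additionally require a MIDI library and a choice of tempo and resolution, which this sketch leaves out.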

Raw Audio

The YouTube links for all the raw audio are listed in "youtubeLinks.txt".

"download.py" can be used to get the raw audio.
