Training takes up too much disk space #10

Open · JRMeyer opened this issue Dec 24, 2017 · 0 comments

JRMeyer (Contributor) commented Dec 24, 2017

Hi Oliver,

When I train on a 3.5-hour corpus, I run out of disk space (12 GB) very quickly:

OSSIAN$ du -h train/chv/speakers/news/naive_01_nn -d 1
26M     train/chv/speakers/news/naive_01_nn/time_lab
888K    train/chv/speakers/news/naive_01_nn/dnn_training_ACOUST
9.7G    train/chv/speakers/news/naive_01_nn/cmp
129M    train/chv/speakers/news/naive_01_nn/lab_dur
8.7M    train/chv/speakers/news/naive_01_nn/align_lab
8.5M    train/chv/speakers/news/naive_01_nn/dur
64M     train/chv/speakers/news/naive_01_nn/utt
253M    train/chv/speakers/news/naive_01_nn/processors
12M     train/chv/speakers/news/naive_01_nn/align_log
629M    train/chv/speakers/news/naive_01_nn/lab_dnn
11G     train/chv/speakers/news/naive_01_nn

I see most of the space is taken up under the cmp directory:

OSSIAN$ du -h train/chv/speakers/news/naive_01_nn/cmp -d 1
4.0K    train/chv/speakers/news/naive_01_nn/cmp/nn_mgc_lf0_vuv_bap_199
4.0K    train/chv/speakers/news/naive_01_nn/cmp/nn_norm_mgc_lf0_vuv_bap_199
4.4G    train/chv/speakers/news/naive_01_nn/cmp/binary_label_502
2.9G    train/chv/speakers/news/naive_01_nn/cmp/nn_no_silence_lab_502
4.0K    train/chv/speakers/news/naive_01_nn/cmp/nn_no_silence_lab_norm_502
9.7G    train/chv/speakers/news/naive_01_nn/cmp

So the binary_label_502 and nn_no_silence_lab_502 take up the most space under cmp.
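For context, here is a rough back-of-envelope estimate that lands in the same ballpark as those sizes. The 5 ms frame shift and float32 storage are my assumptions, not something I have verified against the Ossian/Merlin configs:

```python
# Back-of-envelope size estimate for the frame-level label features under cmp/.
# Assumptions (mine, not checked against the configs):
#   - 5 ms frame shift, i.e. 200 frames per second of audio
#   - values stored as float32 (4 bytes each)
hours = 3.5
frames = hours * 3600 / 0.005      # ~2.5 million frames
label_dim = 502                    # matches binary_label_502 / nn_no_silence_lab_502

size_bytes = frames * label_dim * 4
print(f"~{size_bytes / 2**30:.1f} GiB per 502-dim label set")
# -> ~4.7 GiB, in line with the 4.4 GiB binary_label_502 directory
# (nn_no_silence_lab_502 is presumably smaller because silence frames are dropped)
```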

Any workarounds?

I'm running Ossian on AWS with 16 GB of disk space, and since the OS takes up about 4 GB, that leaves only about 12 GB for the 11 GB (and still growing) training directory, so training crashes after I train the frontend and move on to Merlin.

Specifically, it crashes after this command:

python ./tools/merlin/src/run_merlin.py /home/ubuntu/Ossian/train//chv/speakers/news/naive_01_nn/processors/acoustic_predictor/config.cfg

Thanks!

-josh
