wav2text

A program that detects speech segments in audio files with inaSpeechSegmenter, transcribes each detected segment with Cloud Speech-to-Text, and saves the transcript of each segment in CSV format.
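The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in wav2text.py: `transcribe_segments`, `segment_fn`, and `recognize_fn` are hypothetical names, and the CSV columns (start, end, text) are an assumed layout. The segmenter and the recognizer are passed in as callables so the flow can be tried without TensorFlow or cloud credentials.

```python
import csv
from typing import Callable, Iterable, List, Tuple

Segment = Tuple[str, float, float]  # (label, start_sec, end_sec)


def transcribe_segments(
    wav_path: str,
    csv_path: str,
    segment_fn: Callable[[str], Iterable[Segment]],
    recognize_fn: Callable[[str, float, float], str],
    speech_labels=("speech", "male", "female"),
) -> List[Tuple[float, float, str]]:
    """Segment wav_path, transcribe each speech segment, and write a CSV."""
    rows = []
    for label, start, end in segment_fn(wav_path):
        if label not in speech_labels:
            continue  # skip music, noise, and silence
        rows.append((start, end, recognize_fn(wav_path, start, end)))

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["start", "end", "text"])  # assumed column layout
        writer.writerows(rows)
    return rows
```

In a real run, `segment_fn` would wrap inaSpeechSegmenter and `recognize_fn` would wrap a Cloud Speech-to-Text request for the given time range.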

Setup

inaSpeechSegmenter

inaSpeechSegmenter is a CNN-based audio segmentation toolkit.
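As a rough sketch of how the toolkit is typically called from Python: a `Segmenter` instance is applied to a file path and yields (label, start, end) tuples. The helper names `segment_audio` and `keep_speech`, and the `min_duration` filter, are illustrative assumptions and are not taken from this repository.

```python
from typing import List, Tuple

Segment = Tuple[str, float, float]  # (label, start_sec, end_sec)

# Labels treated as speech: "male"/"female" with gender detection on,
# "speech" when the Segmenter is created with detect_gender=False.
SPEECH_LABELS = {"speech", "male", "female"}


def segment_audio(wav_path: str) -> List[Segment]:
    """Run inaSpeechSegmenter on one file and return its raw segmentation."""
    # Imported lazily because loading the TensorFlow models is slow.
    from inaSpeechSegmenter import Segmenter

    segmenter = Segmenter()
    return list(segmenter(wav_path))


def keep_speech(segments: List[Segment], min_duration: float = 0.0) -> List[Segment]:
    """Drop non-speech segments and those shorter than min_duration seconds."""
    return [
        (label, start, end)
        for label, start, end in segments
        if label in SPEECH_LABELS and (end - start) >= min_duration
    ]
```

Filtering out very short segments is a design choice rather than a requirement; it may reduce the number of recognition requests that return empty text.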

Installation

$ virtualenv -p python3 inaSpeechSegEnv
$ source inaSpeechSegEnv/bin/activate
$ pip install tensorflow-gpu # GPU build; install only one of the two tensorflow packages
$ pip install tensorflow # CPU build
$ pip install inaSpeechSegmenter
$ pip install google-cloud-speech
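After installing, it can help to confirm that the packages are visible inside the virtualenv. The snippet below is one possible check; `missing_modules` and the `REQUIRED` list are hypothetical helpers, not part of this repository. It only looks the modules up and does not import TensorFlow itself.

```python
import importlib.util

# Import names of the packages installed above.
REQUIRED = ["tensorflow", "inaSpeechSegmenter", "google.cloud.speech"]


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:  # raised when a parent package is absent
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules() or "none")
```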

Cloud Speech-to-Text

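Before running the program, create a service account in the Cloud Console, download its JSON key file, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the key's path so the client library can authenticate.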

$ export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/my-key.json"
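With the variable set, a recognition request through the google-cloud-speech client might look like the sketch below. The helper names `credentials_configured` and `recognize_wav_bytes` are hypothetical, the `en-US` default language is an assumption, and the 16 kHz LINEAR16 settings assume audio prepared as 16 kHz mono WAV. Note that synchronous recognition is meant for short audio (roughly up to a minute), which fits per-segment requests.

```python
import os


def credentials_configured(env=os.environ) -> bool:
    """True if GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)


def recognize_wav_bytes(content: bytes, language_code: str = "en-US") -> str:
    """Send 16 kHz mono LINEAR16 audio to Cloud Speech-to-Text."""
    from google.cloud import speech  # lazy import; needs google-cloud-speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=content)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code=language_code,  # assumed default; set to your language
    )
    response = client.recognize(config=config, audio=audio)
    # Join the top alternative of each result into one transcript string.
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )
```

Calling `credentials_configured()` before the first request gives a clearer failure than an authentication error raised from inside the client.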

ffmpeg commands

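Extract an audio track from a video file as a 16 kHz mono WAV file. The -map 0:2 option selects the third stream of the first input; adjust the index to match your file.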

$ ffmpeg -i input.mp4 -ar 16000 -ac 1 -map 0:2 output.wav
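The same conversion can be driven from Python, for example when converting many files. This is a minimal sketch assuming `ffmpeg` is on the PATH; `ffmpeg_command` and `convert` are hypothetical helper names and are not taken from mp42Wav.sh.

```python
import subprocess
from typing import List


def ffmpeg_command(src: str, dst: str, stream: str = "0:2") -> List[str]:
    """Build the ffmpeg argument list for a 16 kHz mono WAV extraction."""
    return [
        "ffmpeg",
        "-i", src,
        "-ar", "16000",  # resample to 16 kHz
        "-ac", "1",      # downmix to mono
        "-map", stream,  # which input stream to keep
        dst,
    ]


def convert(src: str, dst: str, stream: str = "0:2") -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(ffmpeg_command(src, dst, stream), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.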

