wav2text

A program that detects speech segments in audio files with inaSpeechSegmenter, transcribes each detected segment with Cloud Speech-to-Text, and saves the transcript of each segment in CSV format.
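As a rough illustration of the final step, the sketch below writes one CSV row per transcribed segment. The `write_transcripts` helper and the column names are assumptions made for this example; they are not taken from the program itself and its real output layout may differ.

```python
import csv
import io


def write_transcripts(rows, out):
    """Write (label, start_sec, end_sec, transcript) rows to the text stream `out`.

    Returns the number of data rows written. The header names are hypothetical.
    """
    writer = csv.writer(out)
    writer.writerow(["label", "start_sec", "end_sec", "transcript"])
    count = 0
    for label, start, end, transcript in rows:
        # Times are rounded to centiseconds to keep the file readable.
        writer.writerow([label, f"{start:.2f}", f"{end:.2f}", transcript])
        count += 1
    return count


if __name__ == "__main__":
    buf = io.StringIO()
    write_transcripts([("female", 1.5, 4.0, "hello world")], buf)
    print(buf.getvalue())
```

Writing to a stream rather than a path keeps the helper easy to try out in memory; pass an open file (`open(path, "w", newline="")`) to write to disk.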

Setup

inaSpeechSegmenter

inaSpeechSegmenter is a CNN-based audio segmentation toolkit.

Installation

$ virtualenv -p python3 inaSpeechSegEnv
$ source inaSpeechSegEnv/bin/activate
$ pip install tensorflow-gpu # for GPU support (install only one of the two tensorflow packages)
$ pip install tensorflow # for CPU only
$ pip install inaSpeechSegmenter
$ pip install google-cloud-speech
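Once the packages are installed, the segmenter can be called roughly as follows. This is a minimal sketch, not the program's actual code: `keep_speech`, `detect_speech_segments`, and the `min_duration` parameter are hypothetical names introduced here. It relies on inaSpeechSegmenter returning a list of `(label, start, end)` tuples, where speech is labelled `male`/`female` when gender detection is on (the default) and `speech` when it is off.

```python
# Labels treated as speech; everything else (e.g. "music", "noise",
# "noEnergy") is dropped.
SPEECH_LABELS = {"speech", "male", "female"}


def keep_speech(segmentation, min_duration=0.0):
    """Keep (label, start, end) tuples that are speech and last at least
    `min_duration` seconds (hypothetical filter, off by default)."""
    return [
        (label, start, end)
        for label, start, end in segmentation
        if label in SPEECH_LABELS and (end - start) >= min_duration
    ]


def detect_speech_segments(wav_path):
    """Run inaSpeechSegmenter on `wav_path` and return only its speech segments."""
    # Imported lazily because importing inaSpeechSegmenter loads TensorFlow.
    from inaSpeechSegmenter import Segmenter

    segmenter = Segmenter()
    return keep_speech(segmenter(wav_path))
```

Filtering out very short segments before sending them to the recognizer is one way to reduce the number of API calls; whether that is desirable depends on the material.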

Cloud Speech-to-Text

Before running the program, create a service account in the Cloud Console, download its key file, and configure authentication by pointing the GOOGLE_APPLICATION_CREDENTIALS environment variable at that file.

$ export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/my-key.json"
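With the variable set, a single segment can be sent to Cloud Speech-to-Text along these lines. This is a sketch under stated assumptions: `check_credentials` and `transcribe` are hypothetical helper names, the language code is only an example, and the audio is assumed to be 16 kHz, 16-bit mono WAV data. Synchronous `recognize` requests are intended for short audio (about a minute or less), so longer segments would need a different call.

```python
import os


def check_credentials(env=os.environ):
    """Return the key file path, or raise if the variable is unset or stale."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    return path


def transcribe(wav_bytes, language_code="en-US"):
    """Send one short audio segment to the API and return its transcript."""
    # Imported lazily so the credential check above works without the package.
    from google.cloud import speech

    check_credentials()
    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=wav_bytes)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,  # assumed to match the converted audio
        language_code=language_code,  # example value, adjust as needed
    )
    response = client.recognize(config=config, audio=audio)
    # Each result covers a consecutive portion of the audio; take the top
    # alternative of each and join them.
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )
```

Checking the variable up front gives a clearer error than the one raised deep inside the client library when credentials are missing.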

ffmpeg commands

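Extract a 16 kHz mono WAV file from a video. The -map 0:2 option selects the stream with index 2 of the first input; adjust it to match the audio stream in your file.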

$ ffmpeg -i input.mp4 -ar 16000 -ac 1 -map 0:2 output.wav
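The WAV file produced by the command above can then be cut into per-segment clips before recognition. The sketch below does this with the standard `wave` module; `cut_segment` is a hypothetical helper, and the check for 16 kHz mono simply mirrors the `-ar 16000 -ac 1` options used in the command.

```python
import io
import wave


def cut_segment(source, start_sec, end_sec):
    """Return a standalone WAV file (as bytes) holding [start_sec, end_sec).

    `source` may be a path or a binary file object, as accepted by wave.open.
    """
    with wave.open(source, "rb") as src:
        rate = src.getframerate()
        if rate != 16000 or src.getnchannels() != 1:
            raise ValueError("expected 16 kHz mono audio")
        params = src.getparams()
        src.setpos(int(start_sec * rate))  # seek to the first frame
        frames = src.readframes(int((end_sec - start_sec) * rate))

    buf = io.BytesIO()
    with wave.open(buf, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the writer closes.
        dst.writeframes(frames)
    return buf.getvalue()
```

For example, `cut_segment("output.wav", 1.5, 4.0)` would return the bytes of a 2.5 second clip, provided the file is at least 4 seconds long.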

CREDITS

inaSpeechSegmenter
python-speech
