A program that segments speech data with inaSpeechSegmenter and transcribes it with Cloud Speech-to-Text.

ctxzz/wav2text

wav2text

A program that detects speech segments in audio files with inaSpeechSegmenter, runs Cloud Speech-to-Text on each detected segment, and saves the transcript of each segment in CSV format.
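The last step of that pipeline, saving one transcript per segment, can be sketched as follows. The column names and the sample rows are assumptions made for illustration; the CSV layout actually used by wav2text is not specified here.

```python
import csv
import io


def write_segments_csv(rows, out):
    """Write (label, start, end, transcript) rows to a CSV stream.

    The header names below are hypothetical, not taken from wav2text.
    """
    writer = csv.writer(out)
    writer.writerow(["label", "start_sec", "end_sec", "transcript"])
    for label, start, end, transcript in rows:
        writer.writerow([label, f"{start:.2f}", f"{end:.2f}", transcript])


# Made-up example data standing in for real segmentation and recognition results.
rows = [
    ("speech", 0.0, 2.5, "hello"),
    ("speech", 3.1, 5.0, "good morning, everyone"),
]
buf = io.StringIO()
write_segments_csv(rows, buf)
print(buf.getvalue())
```

Using the `csv` module rather than joining strings by hand means transcripts that contain commas or quotes are escaped correctly.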

Setup

inaSpeechSegmenter

inaSpeechSegmenter is a CNN-based audio segmentation toolkit.

Installation

$ virtualenv -p python3 inaSpeechSegEnv
$ source inaSpeechSegEnv/bin/activate
$ pip install tensorflow-gpu # GPU build; install only one of the two TensorFlow packages
$ pip install tensorflow # CPU build
$ pip install inaSpeechSegmenter
$ pip install google-cloud-speech
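Once the packages are installed, segmentation can be tried from Python. The sketch below assumes that calling a `Segmenter` instance on a media path returns a list of `(label, start, end)` tuples, with times in seconds, and that speech is labelled `speech`, `male`, or `female` depending on whether gender detection is enabled. The helper names `speech_segments` and `segment_file` are hypothetical, and this is not necessarily how wav2text itself calls the library.

```python
# Labels treated as speech; other labels (music, noise, silence) are dropped.
SPEECH_LABELS = {"speech", "male", "female"}


def speech_segments(segmentation):
    """Keep only the (start, end) times of segments labelled as speech."""
    return [(start, end) for label, start, end in segmentation
            if label in SPEECH_LABELS]


def segment_file(wav_path):
    """Run inaSpeechSegmenter on a file and return its speech segments."""
    # Imported lazily because loading the library and its models is slow.
    from inaSpeechSegmenter import Segmenter

    segmenter = Segmenter()
    return speech_segments(segmenter(wav_path))


# Made-up segmentation result, used here instead of a real audio file.
example = [
    ("noEnergy", 0.0, 0.4),
    ("male", 0.4, 3.2),
    ("music", 3.2, 9.0),
    ("female", 9.0, 12.5),
]
print(speech_segments(example))
```

Keeping the filter separate from the model call makes it possible to check the filtering logic without a GPU or any audio file.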

Cloud Speech-to-Text

Before running the program, create a service account in the Google Cloud Console, download its JSON key file, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that file:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/my-key.json"
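A minimal sketch of how the variable might be checked before sending audio to Cloud Speech-to-Text. The helper names `credentials_path` and `transcribe` are hypothetical, and the 16 kHz sample rate and `en-US` language code are assumptions rather than settings taken from wav2text. The synchronous `recognize` call shown here is intended for short audio (roughly one minute or less); longer segments would need the long-running variant.

```python
import os


def credentials_path(env=os.environ):
    """Return the service-account key path, or fail with a clear message."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    return path


def transcribe(wav_bytes, language_code="en-US"):
    """Transcribe the bytes of a short 16 kHz mono 16-bit WAV file."""
    # Imported lazily so the credential check can be used without the library.
    from google.cloud import speech

    credentials_path()  # fail early if authentication is not configured
    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=wav_bytes)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,      # assumed to match the audio file
        language_code=language_code,  # assumed; set to the spoken language
    )
    response = client.recognize(config=config, audio=audio)
    # Each result holds ranked alternatives; keep the top one from each.
    return " ".join(r.alternatives[0].transcript for r in response.results)
```

Checking the variable up front gives a readable error instead of an authentication failure deep inside the client library.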

ffmpeg commands

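Extract a 16 kHz mono WAV file from a video file. The -map 0:2 option selects the stream with index 2 in the input (its third stream); adjust it to match the audio stream in your file.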

$ ffmpeg -i input.mp4 -ar 16000 -ac 1 -map 0:2 output.wav
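The same conversion can be driven from Python when many files need processing. This is a sketch that only wraps the command above; the function names `ffmpeg_extract_cmd` and `extract_audio` and their parameters are hypothetical, and ffmpeg is assumed to be on the PATH.

```python
import subprocess


def ffmpeg_extract_cmd(src, dst, stream="0:2", rate=16000, channels=1):
    """Build the ffmpeg argument list for extracting one audio stream."""
    return [
        "ffmpeg", "-i", src,
        "-ar", str(rate),      # output sample rate in Hz
        "-ac", str(channels),  # number of output channels (1 = mono)
        "-map", stream,        # input stream to keep, as input:stream index
        dst,
    ]


def extract_audio(src, dst, **options):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_extract_cmd(src, dst, **options), check=True)


print(" ".join(ffmpeg_extract_cmd("input.mp4", "output.wav")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.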

CREDITS
