Create Audio_Pipeline.md
rmusser01 committed Dec 3, 2024
1 parent 9fb6eba commit 2eff367
Showing 1 changed file with 52 additions and 0 deletions.
52 changes: 52 additions & 0 deletions Docs/Design/Audio_Pipeline.md
@@ -0,0 +1,52 @@
# Audio Pipelines

## Introduction
This page documents the audio processing pipelines in tldw and explains the reasoning behind the design decisions made in them.


## Audio Pipelines


### Faster_Whisper


### Audio Language Models



### Link Dump:
https://github.com/kadirnar/whisper-plus

Papers
https://arxiv.org/pdf/2212.04356

WER:
https://pubs.aip.org/asa/jel/article/4/2/025206/3267247/Evaluating-OpenAI-s-Whisper-ASR-Performance
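
For orientation alongside the WER links, here is a minimal sketch of how word error rate is commonly defined: the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical and not part of tldw. The sketch assumes plain whitespace tokenization with no case or punctuation normalization, which published evaluations usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: tokens are whitespace-separated and compared exactly;
    no text normalization is performed here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.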

Transcription:
https://github.com/AugmendTech/treeseg
https://www.arxiv.org/abs/2407.12028
https://huggingface.co/spaces/aadnk/faster-whisper-webui
https://huggingface.co/spaces/zhang082799/openai-whisper-large-v3-turbo
https://petewarden.com/2024/10/21/introducing-moonshine-the-new-state-of-the-art-for-speech-to-text/
https://github.com/usefulsensors/moonshine?tab=readme-ov-file
https://github.com/revdotcom/reverb-self-hosted/tree/main/reverb-self-hosted-api
https://github.com/SpeechColab/GigaSpeech
https://huggingface.co/nvidia/canary-1b
https://developer.nvidia.com/blog/accelerating-leaderboard-topping-asr-models-10x-with-nvidia-nemo/
https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
https://www.futurebeeai.com/blog/breaking-down-word-error-rate
https://github.com/MahmoudAshraf97/whisper-diarization/
https://github.com/transcriptionstream/transcriptionstream
https://github.com/SYSTRAN/faster-whisper
https://whisperapi.com/word-error-rate-wer
https://github.com/oliverguhr/deepmultilingualpunctuation
https://arxiv.org/abs/2311.00430
https://github.com/PyAV-Org/PyAV
https://github.com/snakers4/silero-vad
https://github.com/m-bain/whisperX
https://amgadhasan.substack.com/p/sota-asr-tooling-long-form-transcription



