[Motivated by a feature request to handle live streams in beeb, placeholder notes for when I'm ready to implement transcription from live streams following this]
Speaker segmentation is the only part that complicates the workflow here.
Essentially the current segmentation is "offline" (i.e. done after the show has ended), but we could switch to an "online" segmentation:
- If the show has already begun, 'rewind' by stepping back to the first M4S stream segment and download them all
  - (beeb will handle this)
- While the show is on air, continue to download each new M4S stream segment until the last (at which point the show goes off air)
  - (again, beeb will handle this)
- As soon as possible, merge all the downloaded M4S stream segments and label the "low energy" time points
- Now (ignoring any still-ongoing stream segment downloads) split the merged stream at these points to produce segmented audio clips
- Don't use the last of these clips (the one that runs to the end of the merged audio stream)! Its end may be cut off mid-utterance simply because it falls at the end of what has been downloaded so far. Keep that one for the next iteration
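The split-and-hold-back step above could be sketched roughly as follows, assuming the merged stream has been decoded to a mono PCM numpy array; the window length and energy quantile are illustrative placeholders, not tuned values:

```python
import numpy as np

def low_energy_split_points(audio, sr, win_s=0.5, quantile=0.1):
    """Return sample indices of windows whose RMS energy falls in the
    lowest `quantile` of all windows (candidate "low energy" time points)."""
    win = int(sr * win_s)
    n = len(audio) // win
    frames = audio[: n * win].reshape(n, win)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    threshold = np.quantile(rms, quantile)
    return [i * win for i in range(n) if rms[i] <= threshold]

def split_completed_clips(audio, points):
    """Split at the low-energy points, but hold back the final clip
    (it may end mid-utterance) to carry over into the next iteration."""
    bounds = [0] + sorted(points) + [len(audio)]
    clips = [audio[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
    return clips[:-1], clips[-1]  # (ready clips, carry-over clip)
```

In a real loop you would prepend the carry-over clip to the next batch of merged segments before recomputing the split points.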
...
[TBC]
Due to a limitation in the models I'm using (maximum token sequence length) I can't actually input an entire programme to these steps. In that sense there's no benefit gained from waiting until a programme finishes to build the MP4.
If I implemented live transcription, I'd get the end result much sooner (as I could begin processing the audio while the show was still on air), so I'd be interested in this too.
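Since the model's maximum sequence length forces chunked input either way, the windowing could look something like this; the 30 s window and 2 s overlap are hypothetical numbers, not the real model limits:

```python
def chunk_bounds(n_samples, sr, max_s=30.0, overlap_s=2.0):
    """Yield (start, end) sample bounds covering the audio in
    model-sized windows, with a small overlap so that speech cut
    at a window edge is seen whole in the next window."""
    step = int((max_s - overlap_s) * sr)
    win = int(max_s * sr)
    start = 0
    while start < n_samples:
        yield (start, min(start + win, n_samples))
        if start + win >= n_samples:
            break
        start += step
```

Because each chunk is independent, chunks can be fed to the model as soon as enough stream segments have arrived, rather than after the programme ends.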