webvtt.read_buffer doesnt work after upgrading to 0.4.4 #29

theaiinstitute · 2020-04-06T15:10:52Z

Hi,
Here's the full script I ran with webvtt-py version 0.4.4:

from io import StringIO
import urllib.request
import webvtt

url = 'https://course-recording-q1-2020-taii.s3.eu-west-3.amazonaws.com/us/GMT20200117-205611_AI-Inst--U.transcript.vtt'
response = urllib.request.urlopen(url)
data = response.read() 
text = data.decode('utf-8')
buffer = StringIO(text)

for l in webvtt.read_buffer(buffer):
    print(l.text)

This script prints nothing, but printing the variable text shows plenty of content, so the download and decoding work. I think there is a problem with the function read_buffer in version 0.4.4, because after downgrading to 0.4.3 the same script works fine.
Please review this!
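
To separate a download or decoding problem from a parsing problem, one quick check is to count the cue timing lines in the decoded text without involving the parser at all. This is only a minimal sketch using the standard library; the helper name `count_cue_timings` and the sample string are made up for illustration and are not part of webvtt-py.

```python
import re

# Hypothetical helper, not part of webvtt-py: matches lines that look like
# WebVTT cue timings, e.g. "00:00:01.000 --> 00:00:04.000".
# The hours field is optional in WebVTT, so "00:05.000" is accepted too.
_TIMING = re.compile(
    r"^\s*(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)


def count_cue_timings(text):
    """Count the lines in `text` that look like cue timing lines."""
    return sum(1 for line in text.splitlines() if _TIMING.match(line))


# Made-up sample; in the script above this would be the decoded `text`.
sample = (
    "WEBVTT\n\n"
    "1\n00:00:01.000 --> 00:00:04.000\nHello\n\n"
    "00:05.000 --> 00:07.500\nWorld\n"
)
print(count_cue_timings(sample))
```

If the count for the real text is non-zero while the loop over webvtt.read_buffer prints nothing, the content arrived intact and the parsing step is the likely culprit.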


glut23 · 2020-04-09T09:39:26Z

Hi @theaiinstitute I released 0.4.5 with a fix. Please confirm this resolves the issue. Thanks!
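
When confirming, it can help to first check which version is actually installed in the environment that runs the script. A small sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and that the package was installed under the distribution name `webvtt-py`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Assumption: the package was installed as "webvtt-py".
print(installed_version("webvtt-py"))
```

If this prints None, or a version older than 0.4.5, the script is not running against the fixed release.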

theaiinstitute · 2020-04-13T19:49:37Z

It works! Thanks for your quick reaction, I appreciate it!

igifar · 2022-10-16T11:21:05Z

Unfortunately, I am running into a problem too. More than two years later, read_buffer still fails for me:

from io import StringIO
import urllib.request
import webvtt

url = 'https://hls.ted.com/project_masters/7970/subtitles/ja/full.vtt'

response = urllib.request.urlopen(url)
data = response.read()
text = data.decode('utf-8')
buffer = StringIO(text)

for l in webvtt.read_buffer(buffer):
    print(l.text)

  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\webvtt.py", line 68, in read_buffer
    parser = WebVTTParser().read_from_buffer(buffer)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 33, in read_from_buffer
    self._parse(content)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 214, in _parse
    self._parse_blocks(blocks)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 250, in _parse_blocks
    raise MalformedCaptionError(
webvtt.errors.MalformedCaptionError: Standalone cue identifier in line 128.
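
Since the error names a line number, a natural next step is to look at the downloaded text around that line. The following is a rough sketch using only the standard library. It assumes the parser's line number corresponds to lines of the decoded text as split by `str.splitlines()`, which may be off by a line or two; the helper name `lines_around` and the stand-in text are hypothetical.

```python
def lines_around(text, line_no, radius=3):
    """Return (number, line) pairs near a 1-based line number."""
    lines = text.splitlines()
    start = max(line_no - 1 - radius, 0)
    end = min(line_no + radius, len(lines))
    return [(i + 1, lines[i]) for i in range(start, end)]


# Stand-in text; in the script above this would be the decoded `text`.
sample = "\n".join(f"line {n}" for n in range(1, 201))
for number, line in lines_around(sample, 128):
    print(f"{number:>5}: {line}")
```

Running this on the real text should show whether line 128 holds a cue identifier that is not followed by a timing line, which is what the error message suggests.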

glut23 added a commit that referenced this issue Apr 9, 2020

Fix parsing issue #29

8538e1a

glut23 closed this as completed Apr 14, 2020
