webvtt.read_buffer doesn't work after upgrading to 0.4.4 #29

Closed
theaiinstitute opened this issue Apr 6, 2020 · 3 comments

Comments

@theaiinstitute

Hi,
Here's the full script I ran with webvtt-py version 0.4.4:

from io import StringIO
import urllib.request
import webvtt

url = 'https://course-recording-q1-2020-taii.s3.eu-west-3.amazonaws.com/us/GMT20200117-205611_AI-Inst--U.transcript.vtt'
response = urllib.request.urlopen(url)
data = response.read() 
text = data.decode('utf-8')
buffer = StringIO(text)

for l in webvtt.read_buffer(buffer):
    print(l.text)

This script prints nothing, but when I print the variable text, it shows a lot of content. I think there is a problem with the read_buffer function in version 0.4.4, because when I downgraded to 0.4.3 everything worked fine.
Please review this!
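One way to confirm that the downloaded text really contains cues, independently of webvtt-py, is to count the cue timing lines directly. The sketch below uses only the standard library; `count_cue_timings` and the sample text are hypothetical and not part of webvtt-py. If the count is non-zero while `read_buffer` yields nothing, the input is probably not the cause.

```python
import re

# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours field is optional in WebVTT, so "00:01.000 --> 00:04.000" also matches.
CUE_TIMING = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}",
    re.MULTILINE,
)


def count_cue_timings(text: str) -> int:
    """Return how many cue timing lines appear in raw WebVTT text (hypothetical helper)."""
    return len(CUE_TIMING.findall(text))


# Made-up sample standing in for the decoded `text` variable from the script above.
sample = "\n".join([
    "WEBVTT",
    "",
    "1",
    "00:00:01.000 --> 00:00:04.000",
    "Hello",
    "",
    "2",
    "00:00:05.000 --> 00:00:08.000",
    "World",
    "",
])

print(count_cue_timings(sample))  # prints 2
```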

glut23 added a commit that referenced this issue Apr 9, 2020
@glut23
Owner

glut23 commented Apr 9, 2020

Hi @theaiinstitute, I released 0.4.5 with a fix. Please confirm that it resolves the issue. Thanks!

@theaiinstitute
Author

It works! Thanks for your quick response, I appreciate it!

@glut23 glut23 closed this as completed Apr 14, 2020
@igifar

igifar commented Oct 16, 2022

Unfortunately, I am running into a problem too. Three years later, the problem still exists:

from io import StringIO
import urllib.request
import webvtt

url = 'https://hls.ted.com/project_masters/7970/subtitles/ja/full.vtt'
result = ''

response = urllib.request.urlopen(url)
data = response.read()
text = data.decode('utf-8')
buffer = StringIO(text)

for l in webvtt.read_buffer(buffer):
    print(l.text)

  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\webvtt.py", line 68, in read_buffer
    parser = WebVTTParser().read_from_buffer(buffer)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 33, in read_from_buffer
    self._parse(content)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 214, in _parse
    self._parse_blocks(blocks)
  File "C:\Users\FARSHAD\AppData\Local\Programs\Python\Python310\lib\site-packages\webvtt\parsers.py", line 250, in _parse_blocks
    raise MalformedCaptionError(
webvtt.errors.MalformedCaptionError: Standalone cue identifier in line 128.
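
To see which lines might trigger an error like the one above, the raw text can be scanned for blocks that consist of a single line with no `-->` timing. This is a rough sketch under an assumption: that "Standalone cue identifier" refers to a one-line block separated by blank lines from its timing line. The helper `find_standalone_identifiers` is hypothetical and is not part of webvtt-py, and the sample below is made up; it does not reproduce the TED file.

```python
def find_standalone_identifiers(text: str) -> list[int]:
    """Return 1-based line numbers of one-line blocks that have no '-->' timing.

    Assumption: such blocks are what the parser reports as standalone cue identifiers.
    """
    suspects = []
    block = []  # (line_number, content) pairs for the current blank-line-delimited block
    # The extra "" guarantees that the last block gets flushed.
    for number, line in enumerate(text.splitlines() + [""], start=1):
        if line.strip():
            block.append((number, line))
            continue
        if len(block) == 1:
            first_no, first = block[0]
            is_header = first_no == 1 and first.lstrip("\ufeff").startswith("WEBVTT")
            is_note = first.startswith("NOTE")
            if not is_header and not is_note and "-->" not in first:
                suspects.append(first_no)
        block = []
    return suspects


# Made-up sample: the identifier "2" on line 7 is cut off from its timing line by a blank line.
sample = "\n".join([
    "WEBVTT",
    "",
    "1",
    "00:00:01.000 --> 00:00:04.000",
    "Hello",
    "",
    "2",
    "",
    "00:00:05.000 --> 00:00:08.000",
    "World",
    "",
])

print(find_standalone_identifiers(sample))  # prints [7]
```

Running the helper on the decoded `text` from the script above would list candidate lines to inspect by hand; whether line 128 of that file has this shape is not confirmed here.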
