Merge pull request #50 from narumiruna/youtube-transcript-languages
Support specifying YouTube transcript language
gagb authored Dec 17, 2024
2 parents 51c1453 + 2d3ffea commit 73776b2
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/markitdown/_markitdown.py
@@ -351,8 +351,11 @@ def convert(
             assert isinstance(params["v"][0], str)
             video_id = str(params["v"][0])
             try:
+                youtube_transcript_languages = kwargs.get(
+                    "youtube_transcript_languages", ("en",)
+                )
                 # Must be a single transcript.
-                transcript = YouTubeTranscriptApi.get_transcript(video_id)  # type: ignore
+                transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=youtube_transcript_languages)  # type: ignore
                 transcript_text = " ".join([part["text"] for part in transcript])  # type: ignore
                 # Alternative formatting:
                 # formatter = TextFormatter()
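
The change reads an optional `youtube_transcript_languages` keyword argument, defaults it to `("en",)`, and forwards it as the `languages` priority list when fetching the transcript. The sketch below illustrates that pattern without any network access. `FAKE_TRANSCRIPTS`, `fake_get_transcript`, and `transcript_text` are hypothetical stand-ins invented for this example and are not part of markitdown or youtube-transcript-api; the fallback behaviour (first available language in the given order wins) is modelled on how the `languages` argument is documented to work, not copied from the library.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical in-memory stand-in for a transcript service:
# video id -> language code -> list of transcript parts.
FAKE_TRANSCRIPTS: Dict[str, Dict[str, List[Dict[str, str]]]] = {
    "abc123": {
        "en": [{"text": "hello"}, {"text": "world"}],
        "de": [{"text": "hallo"}, {"text": "welt"}],
    }
}


def fake_get_transcript(
    video_id: str, languages: Sequence[str] = ("en",)
) -> List[Dict[str, str]]:
    """Return the first transcript whose language appears in `languages`.

    `languages` is treated as a priority list: earlier codes are preferred.
    """
    available = FAKE_TRANSCRIPTS.get(video_id, {})
    for code in languages:
        if code in available:
            return available[code]
    raise LookupError(f"no transcript for {video_id!r} in {tuple(languages)!r}")


def transcript_text(video_id: str, **kwargs: Any) -> str:
    """Mirror the patched lookup: read the option from kwargs, default to English."""
    languages = kwargs.get("youtube_transcript_languages", ("en",))
    transcript = fake_get_transcript(video_id, languages=languages)
    return " ".join(part["text"] for part in transcript)
```

With this stand-in, omitting the option keeps the previous English-only behaviour, while passing `youtube_transcript_languages=("de", "en")` prefers German and falls back to English. Assuming the top-level conversion call forwards its keyword arguments to this converter, a caller would supply the same keyword there.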