[BUG]: Youtube data collector "Failed to locate a transcript for this video!" #2597
Comments
The IP your Ubuntu instance is on is probably being blocked by Google. It is also possible that, when the video is accessed from the Ubuntu instance's IP, it is blocked in the geography associated with that IP.
It's not a geo restriction, tried with different videos.
So no YouTube video works on this instance?
Exactly. I tried 10 different videos from different regions. It works in the desktop app, but not in the Docker instance on the Oracle VM.
When viewing the Docker logs while attempting a collection, do we see an error logged? Hoping this error fires: anything-llm/collector/utils/extensions/YoutubeTranscript/YoutubeLoader/youtube-transcript.js, line 91 in 890fb29.
This is the Docker log after 3 tries.
@stdestro Does this thread apply? The script we are using is a fork of that repo. We broke from it a long time ago to force-patch something in that data connector, but thinking about the network difference, I wonder if this is the issue: are the IPv4 and IPv6 responses from youtube.com different?
So, for me, I cannot ping over IPv6
while over IPv4
while from the desktop I get this:
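To check the IPv4 versus IPv6 theory raised above without relying on ping (ICMP can be filtered even when HTTPS works), a small TCP probe can be run on both the desktop and the Ubuntu VM. This is only a sketch: the function names `can_connect` and `probe`, the host `www.youtube.com`, and port 443 are assumptions chosen for illustration, and the script is not part of AnythingLLM or its collector.

```python
import socket


def can_connect(host, port, family, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds over `family`.

    `family` is socket.AF_INET (IPv4) or socket.AF_INET6 (IPv6).
    """
    try:
        # Resolve only addresses of the requested family.
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # The host has no address of this family, or DNS resolution failed.
        return False

    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            # Refused, timed out, unreachable, or family unsupported: try next.
            continue
    return False


def probe(host="www.youtube.com", port=443):
    """Hypothetical helper: report TCP reachability per address family."""
    return {
        "ipv4": can_connect(host, port, socket.AF_INET),
        "ipv6": can_connect(host, port, socket.AF_INET6),
    }
```

Running `print(probe())` on each machine gives a quick comparison. If the desktop reports both families as reachable and the VM reports only one, that would be consistent with the network difference suspected here, though it would not by itself prove that the transcript failure comes from it.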
How are you running AnythingLLM?
Docker (remote machine)
What happened?
I am trying to collect a transcript with the YouTube transcript data connector.
I have a local installation on macOS that collects the transcript, while the Docker instance on Ubuntu gives the error:
Failed to locate a transcript for this video!
The video link is the same, so the video is not the problem (tried with different videos, same result).
I have the same LLM model and the same LLM agent in the desktop app and in the Docker instance.
The Docker instance is started with --cap-add SYS_ADMIN.
I can scrape websites through the data collector without problems in the Docker instance.
The only problem is collecting transcripts from YouTube.
Are there known steps to reproduce?
No response