Can anyone help me how to use? #23

Open
Ruke805 opened this issue Jun 4, 2020 · 7 comments
Comments

@Ruke805
Ruke805 commented Jun 4, 2020

I'm trying to understand how to make it work, but it's all very confusing.
I'm using Windows 10. Python is installed, and Tesseract is working and added to PATH, but I don't know how to run videocr. I tried to follow what is explained in issue #2.
I created the file get_sub.py.

I put the video and all the scripts in the same folder, but when I run it I get this error:

Traceback (most recent call last):
  File "C:\Users\user\Programs\Python 3.7\venv\Lib\site-packages\videocr\get_sub.py", line 3, in <module>
    import video
  File "C:\Users\user\Python 3.7\venv\Lib\site-packages\videocr\video.py", line 8, in <module>
    from . import constants
ImportError: attempted relative import with no known parent package

Could someone please help me?
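
For anyone hitting the same ImportError: it typically appears when a file that lives inside a package is run or imported as a standalone script, so `from . import constants` has no parent package to resolve against. Below is a minimal, self-contained sketch of that behaviour. `demo_pkg` and its files are made-up stand-ins for illustration, not videocr's actual code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> Path:
    """Create a throwaway package 'demo_pkg' (hypothetical) that uses a relative import."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "constants.py").write_text("VALUE = 42\n")
    # Same pattern as the failing line in the traceback: 'from . import constants'
    (pkg / "video.py").write_text("from . import constants\nprint(constants.VALUE)\n")
    return pkg


def run(args, cwd):
    """Run the current interpreter with the given arguments and capture its output."""
    return subprocess.run(
        [sys.executable, *args], cwd=str(cwd), capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pkg = build_demo_package(root)
    # Run the file directly: Python knows no parent package, so the import fails.
    as_script = run([str(pkg / "video.py")], cwd=root)
    # Run it as a module of the package: the relative import resolves.
    as_module = run(["-m", "demo_pkg.video"], cwd=root)

print("direct run failed:", as_script.returncode != 0)
print("module run output:", as_module.stdout.strip())
```

If that is what happened here, keeping get_sub.py outside site-packages\videocr and importing the installed package by name (rather than `import video`) should avoid the error, though this is untested against the exact setup above.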

@bassSoul
bassSoul commented Jun 11, 2020

I am trying to figure it out as well, despite having absolutely no background with this stuff. I think it's working as I type this. I would extract the videocr folder downloaded from here to your desktop instead (it's simpler). Have you installed pip and then videocr, as suggested under the installation section?

We will try to figure this out together.
Edit: it ended up working for me, as I said, but the results were not great. I've moved on to using SubRip and FineReader, which is quite tedious.

@johan456789

> I've moved on to using SubRip and FineReader, which is quite tedious.

@bassSoul How well does SubRip work?

@bassSoul
bassSoul commented Jul 3, 2020

@johan456789 Depends on the quality and font formatting of the text but overall does a good job if it's fairly standard. It sometimes generates duplicates or blanks, which you have to manually go through. This also could just be because the font I'm working with is brutal and I'm working with animation, which produces more false positives.

Note that ABBYY FineReader is required; SubRip alone won't do the trick. You need to OCR the images into separate .txt files named after your exported images. I believe only FineReader can do this as a batch export.
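
The manual pass over duplicates and blanks mentioned above can be partly automated once the OCR text is loaded. This is only a rough sketch: the `(start, end, text)` cue shape and the `clean_cues` helper are assumptions made for illustration, not part of SubRip, FineReader, or videocr.

```python
from typing import List, Tuple

# Hypothetical cue shape: (start_seconds, end_seconds, text)
Cue = Tuple[float, float, str]


def clean_cues(cues: List[Cue]) -> List[Cue]:
    """Drop blank cues and merge neighbouring cues that carry identical text."""
    cleaned: List[Cue] = []
    for start, end, text in cues:
        text = text.strip()
        if not text:
            continue  # blank OCR result, skip it
        if cleaned and cleaned[-1][2] == text:
            # Same text as the previously kept cue: extend it instead of repeating it.
            prev_start, prev_end, _ = cleaned[-1]
            cleaned[-1] = (prev_start, max(prev_end, end), text)
        else:
            cleaned.append((start, end, text))
    return cleaned


example = [
    (0.0, 1.0, "Hello"),
    (1.0, 2.0, "Hello"),
    (2.0, 3.0, "  "),
    (3.0, 4.0, "Bye"),
]
print(clean_cues(example))
```

Exact matching only catches identical repeats. OCR noise that differs by a character or two would still need a fuzzy comparison or a manual check.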

@p2635
p2635 commented Aug 24, 2020

@theruleof4 unfortunately even those of us who can make it work aren't getting results. I suggest you look at the comments above to see if you can use SubRip or FineReader instead.

@Plaidstallion

I was wondering if Subtitle Edit is capable of doing any or all of this process. It can OCR PGS subtitles with Tesseract, so it seems like it should technically be able to read the images put out by VideoSubFinder. I can't get SubRip to work, personally.

@Johaan01
Johaan01 commented Aug 4, 2021

I don't know if I'm right, but isn't this code broken because the Tesseract data files were moved?
In the README you can find this link: this page
I was also checking some of the code in the constants.py file, and some of the data file URLs were moved as well (they now return a 404 status). You can see for yourself:
TESSDATA_URL = 'https://github.com/tesseract-ocr/tessdata_fast/raw/master/{}.traineddata'

TESSDATA_SCRIPT_URL = 'https://github.com/tesseract-ocr/tessdata_best/raw/master/script/{}.traineddata'
I think it could be fixed by updating the links, but since Tesseract has changed so much, I doubt it would still work.
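
One way to verify the 404 claim is to send a HEAD request to each formatted URL. The sketch below reuses the two templates quoted above. The helper names and the `eng` and `Latin` defaults are assumptions for illustration, and the `opener` parameter exists only so the check can be exercised without network access.

```python
import urllib.error
import urllib.request

TESSDATA_URL = 'https://github.com/tesseract-ocr/tessdata_fast/raw/master/{}.traineddata'
TESSDATA_SCRIPT_URL = 'https://github.com/tesseract-ocr/tessdata_best/raw/master/script/{}.traineddata'


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status of a HEAD request, or None if the host is unreachable."""
    request = urllib.request.Request(url, method='HEAD')
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return err.code
    except urllib.error.URLError:
        return None


def tessdata_status(lang='eng', script='Latin', opener=urllib.request.urlopen):
    """Map each formatted data file URL to its HTTP status (names are assumed examples)."""
    urls = [TESSDATA_URL.format(lang), TESSDATA_SCRIPT_URL.format(script)]
    return {url: http_status(url, opener=opener) for url in urls}


# tessdata_status() performs real network requests, so call it manually, e.g.:
#     for url, status in tessdata_status().items():
#         print(status, url)
```

A status of 200 would suggest the link still resolves (urlopen follows redirects), while 404 would support the report above.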

@devmaxxing

FYI I've created a working fork that uses PaddleOCR instead of Tesseract: https://github.com/oliverfei/videocr-PaddleOCR
