How to read multipage TIF file #50
Comments
Yes, reading multipage TIFFs is supported by Leptonica, the imaging library used by Tesseract; however, I haven't yet implemented support for this in the C# wrapper. The relevant function is `pixaReadMultipageTiff`.
|
This is something I could use as well. Is there any timetable on the implementation of reading multi-page tiffs? |
Sorry, not right now. I might be able to find some time to look into this in a couple of weeks, but it's not a priority for me at the moment. |
FYI, thanks to amferguson we will now support multi-page tiffs in the upcoming 1.1 release (tesseract 3.03). |
Is there any documentation or code sample for processing a multipage TIFF? I checked the existing code samples and nothing is mentioned there. I could really use some guidance. |
I once built an engine to detect the orientation of a scanned image with the help of Tesseract. At the time I wrote it there was no multipage TIFF support, so I wrote some handy TIFF utilities; maybe they will be helpful to you. You can find them over here --> |
You'll want to use `PixArray.LoadMultiPageTiffFromFile` and then iterate and … If you want, feel free to create a wiki article on this (I think everyone …
|
Here is a simple implementation to OCR a multipage TIFF. I'd add it to the wiki, but it looks like I can't create a page. `
` |
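A minimal sketch of the page-by-page approach discussed in this thread, not the poster's original code. It assumes the Tesseract .NET wrapper 1.1 or later, where `PixArray.LoadMultiPageTiffFromFile` is available. The class name, method name, page separator, and the `./tessdata` path are hypothetical choices made for illustration.

```csharp
using System;
using System.Text;
using Tesseract;

static class MultiPageTiffOcr
{
    // Hypothetical helper: OCRs every page of a multipage TIFF and
    // returns the recognized text of all pages as one string.
    public static string ReadAllPages(string tiffPath, string tessdataPath)
    {
        var text = new StringBuilder();

        // tessdataPath is assumed to contain eng.traineddata.
        using (var engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
        using (var pages = PixArray.LoadMultiPageTiffFromFile(tiffPath))
        {
            int pageNumber = 0;

            // Assumption: enumerating the PixArray yields one Pix per TIFF page,
            // and the array manages the lifetime of those Pix instances.
            foreach (Pix pix in pages)
            {
                pageNumber++;

                // The engine handles one page at a time, so each Page
                // is disposed before the next call to Process.
                using (Page page = engine.Process(pix))
                {
                    text.AppendLine("--- page " + pageNumber + " ---");
                    text.AppendLine(page.GetText());
                }
            }
        }

        return text.ToString();
    }

    static void Main(string[] args)
    {
        // Usage: pass the path to the TIFF file as the first argument.
        Console.WriteLine(ReadAllPages(args[0], "./tessdata"));
    }
}
```

Collecting the text in a `StringBuilder` gives the effect of reading the entire document at once, even though the engine still processes each page separately.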
Thanks for the sample, and no, there isn't a way to load a PixArray from an …
|
Charles, from the post it is clear that, if we have a multipage TIFF file, we can use engine.Process() to process each page by looping through the PixArray. Is it possible to process a multipage TIFF file completely in a single call in any of the newer versions of Tesseract? I am using Tesseract 3.0.2.0 and can only see that the engine.Process() method has a few overloads, which accept a Bitmap, a Pix, etc. |
@HK516 No, you'll still need to process each page piecemeal, as detailed above. There haven't been any changes to the wrapper or to Tesseract underneath to allow processing in one go. |
I have a TIF file that is multiple pages. The function Pix.LoadFromFile(filename) appears to be only loading the first page.
Is there a way to load all the pages?
I would like to be able to read the entire document at once.
Thanks