Extracts PDF content using RapidOCR.
Updated Aug 28, 2024 - Python
Sample code for the Datalogics C++, Java, and .NET interfaces of the Adobe PDF Library
Sample code for the Datalogics C++ interface of the Adobe PDF Library
Sample code for the Datalogics .NET interface of the Adobe PDF Library
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
Sample code for the Datalogics Java interface of the Adobe PDF Library, set up to build with Maven
Sample code for the Datalogics .NET Framework interface of the Adobe PDF Library
This UiPath project, developed during the STGI Hackathon, automates resume screening for HR teams. It extracts emails with a specified subject, saves the attached PDF resumes, and uses Tesseract OCR for data extraction. The extracted data is used to fill a form, and at the end of the day an audit report with insights and a CSV of responses is generated and sent to a specified email address.
Simple frontend for OCRmyPDF (Windows only).
ocr resume
Example Django (Python) project containing OCR, PDF-to-OCR-PDF, text similarity/dissimilarity, and PDF-to-PNG converter modules.