GitHub - RajKrishna2123/capstone_project: This project is a research-driven product that has the potential to revolutionize the data entry process by seamlessly integrating highly accurate automation. The core of this work is an in-depth comparative analysis of various approaches to recognizing complex structures.

Capstone project : Structural OCR

Innovations in Document Analysis: Exploring Keras Segmentation and Graph Convolutional Neural Networks for Structural OCR

Dataset link

PubTables-1M (structure): table segmentation data with nearly 1 million tables from scientific articles, annotated with bounding boxes.
Link: https://www.kaggle.com/datasets/bsmock/pubtables-1m-structure
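
To make the "bounding box annotations" concrete, here is a minimal sketch of reading one annotation file. It assumes the annotations are PASCAL VOC style XML (one `object` element per box, with a `name` and a `bndbox`); the sample XML, the label strings, and the function name `parse_voc_boxes` are illustrative and are not taken from this repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in PASCAL VOC layout; real files may carry more fields.
SAMPLE_XML = """
<annotation>
  <filename>example_table.jpg</filename>
  <object>
    <name>table row</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>40</ymax></bndbox>
  </object>
  <object>
    <name>table column</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>90.5</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from one annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are read as floats so fractional pixel values survive.
        coords = tuple(
            float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label,) + coords)
    return boxes


for box in parse_voc_boxes(SAMPLE_XML):
    print(box)
```

Boxes in this form can then be rasterized into masks for a segmentation model or used directly as detection targets, depending on the approach being compared.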

Description

This project was originally developed at Fire Llama as part of my internship, and I am continuing to develop it. It is currently under development as my final-semester project for the subject UCF 439 Capstone Project.

Project Architecture

Requirements

To run this project, you'll need the following specific dependencies:

Python 3.7
TensorFlow GPU 2.4.1 with CUDA 11.0 and cuDNN 8.0
keras 2.4.3
imgaug
opencv 4.5.9
django

You can install the required Python packages using the following command:

pip install <package_name>==<version>

Example

pip install tensorflow_gpu==2.4.1
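
The `pip install <package_name>==<version>` pattern can be generated from a small table of pins, which also makes it easy to compare against what is actually installed. This is a sketch only: the `PINNED` dictionary copies just the two versions pinned by pip package name in the list above, and the helper names are hypothetical.

```python
# Pins copied from the Requirements list. imgaug and django are listed there
# without versions, so they are left out of this illustration.
PINNED = {
    "tensorflow_gpu": "2.4.1",
    "keras": "2.4.3",
}


def pip_commands(pins):
    """Build one 'pip install name==version' command per pinned package."""
    return [f"pip install {name}=={version}" for name, version in pins.items()]


def find_mismatches(pins, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `installed` maps package names to installed versions; a missing package
    is reported with found=None.
    """
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


for command in pip_commands(PINNED):
    print(command)
```

In practice the `installed` mapping could be filled from the output of `pip freeze`; here it is passed in explicitly so the comparison logic stays independent of the environment.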

Installation

There are two ways to get started with the project:

Simple installation, where you satisfy all dependencies on your own OS.
Using a Docker container.

Clone this repository:

git clone https://github.com/RajKrishna2123/capstone_project

cd capstone_project

pip install -r requirements.txt

Use docker container
```
docker build -t project_container:updated1 .
```
The following command runs the project:
```
docker run --gpus all -it -v D:/struct_ocr_data:/app -p 8000:8000 project_container:updated1 /bin/bash
```
Once the container is up and running, if the connection is lost or the terminal is accidentally closed, reconnect to the same container with the following command:
```
docker exec -it <container_id> bash
```

Usage

This project converts single images or bulk batches of images into an editable relational table that preserves the structure shown in the image, all in one step.
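
As an illustration of the final step, the sketch below turns recognized cells into a rectangular table and serializes it as CSV. The `(row, column, text)` cell format and the function names are assumptions made for this example; the project's actual intermediate format is not described here.

```python
import csv
import io


def cells_to_rows(cells):
    """Arrange a list of (row, col, text) cells into a rectangular grid.

    Positions with no recognized cell are filled with empty strings.
    """
    if not cells:
        return []
    n_rows = max(row for row, _, _ in cells) + 1
    n_cols = max(col for _, col, _ in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for row, col, text in cells:
        grid[row][col] = text
    return grid


def rows_to_csv(rows):
    """Serialize a grid of strings to CSV text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical recognition result for a 3 x 2 table with one missing cell.
cells = [(0, 0, "Item"), (0, 1, "Qty"), (1, 0, "Bolt"), (1, 1, "4"), (2, 0, "Nut")]
print(rows_to_csv(cells_to_rows(cells)))
```

The same grid could be written to XLSX or inserted into a database table instead; CSV is used here because it needs only the standard library.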

Features

Extensive Training Data: The AI model is trained on an extensive dataset of 1 million high-resolution images, which supports the system's robustness and accuracy in document structure identification.
MLOps Integration: Our implementation adheres to MLOps practices, ensuring a seamless and automated end-to-end workflow. Continuous integration and delivery pipelines will be established for efficient model deployment and updates.
Containerization: The system will be containerized for deployment as a web app and API service. This promotes scalability and ease of integration into various applications.
Google Drive Integration: A unique feature allows users to effortlessly process bulk data by providing Google Drive links.
Flexible Data Outputs: The system supports versatile data outputs, including CSV, MySQL databases, and XLSX, catering to diverse data management preferences.
Integrated API Service: Integrated API capabilities will provide other developers with easy access to incorporate Structural OCR functionalities into their applications, enhancing overall system accessibility.
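
For the Google Drive feature, a first step is usually to pull the file or folder ID out of a shared link. The sketch below handles the common link shapes (`/file/d/<id>`, `/drive/folders/<id>`, and `?id=<id>`); the function name is hypothetical, the example IDs are made up, and this is not necessarily how the project parses links.

```python
import re

# Common shapes of Google Drive share links; the ID is the captured group.
_DRIVE_ID_PATTERNS = (
    re.compile(r"/file/d/([A-Za-z0-9_-]+)"),
    re.compile(r"/drive/folders/([A-Za-z0-9_-]+)"),
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),
)


def extract_drive_id(url):
    """Return the file or folder ID found in a Drive link, or None if absent."""
    for pattern in _DRIVE_ID_PATTERNS:
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


# Made-up IDs, used only to show the link shapes.
print(extract_drive_id("https://drive.google.com/file/d/1AbC_dEf-123/view?usp=sharing"))
print(extract_drive_id("https://drive.google.com/drive/folders/0BxFolder99"))
```

Returning `None` for unrecognized links lets the caller reject bad input before any download is attempted.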

Documentation

Link to external documentation or detailed guides.

Credits

Special thanks to Rajeev Ratan sir for the awesome repository that supported this project a lot.

Acknowledgements

License

This project is licensed under the MIT License - see the LICENSE file for details.

Folders and files (61 commits in history)

anno_tools
backup_codes
data_preprocessing_pipeline
final_project_dir
front-end
presentation
saved_model
work_sample
.gitattributes
Dockerfile
License.md
README.md
gan - Copy.ipynb
gpu_memory_limit_allocator.py
mobilenet_segnet_model.png
project_architecture.gif
requirements.txt
table.txt
test.ipynb
