This project is a research-driven product that aims to streamline data entry by automating it with high accuracy. At its core is an in-depth comparative analysis of several approaches to recognizing complex document structures.

RajKrishna2123/capstone_project

Capstone Project: Structural OCR

Innovations in Document Analysis: Exploring Keras Segmentation and Graph Convolutional Neural Networks for Structural OCR

Dataset link

PubTables-1M (structure): nearly 1 million tables from scientific articles, with bounding box annotations for table segmentation.
Link: https://www.kaggle.com/datasets/bsmock/pubtables-1m-structure
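
To give a sense of what bounding box annotations look like in practice, here is a minimal sketch that reads one annotation record. It assumes the annotations are stored as PASCAL VOC style XML (one `object` element per labelled region, each with a `name` and a `bndbox`); check the dataset page for the exact layout. The sample XML, the class names in it, and the function name `read_boxes` are illustrative and not taken from this repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in PASCAL VOC style; real annotation files may differ in detail.
SAMPLE_XML = """
<annotation>
  <filename>example_table.jpg</filename>
  <object>
    <name>table row</name>
    <bndbox><xmin>10.0</xmin><ymin>20.0</ymin><xmax>300.0</xmax><ymax>45.5</ymax></bndbox>
  </object>
  <object>
    <name>table column</name>
    <bndbox><xmin>10.0</xmin><ymin>20.0</ymin><xmax>120.0</xmax><ymax>200.0</ymax></bndbox>
  </object>
</annotation>
"""

def read_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) tuples from VOC-style XML."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bnd = obj.find("bndbox")
        coords = tuple(float(bnd.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, coords))
    return boxes

boxes = read_boxes(SAMPLE_XML)
for label, coords in boxes:
    print(label, coords)
```

Each tuple pairs a region label with its pixel coordinates, which is the form most segmentation and detection training loops expect after loading.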

Description

This project was originally developed at Fire Llama as part of my internship, and I am developing it further. It is currently under development as my final-semester project for the course UCF 439 Capstone Project.

Project Architecture

[Figure: project architecture]

Requirements

To run this project, you'll need the following specific dependencies:

  • Python 3.7
  • TensorFlow GPU 2.4.1 with CUDA 11.0 and cuDNN 8.0
  • Keras 2.4.3
  • imgaug
  • OpenCV 4.5.9
  • Django

You can install the required Python packages using the following command:

pip install <package_name>==<version>

Example

pip install tensorflow_gpu==2.4.1
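
After installing, it can help to confirm that the environment matches the list above. The following is a minimal sketch, not part of the repository: it imports each package by its usual import name (`cv2` for OpenCV) and compares the `__version__` attribute with the pinned version. The names `REQUIRED`, `installed_version`, and `find_problems` are hypothetical, and packages listed without a version are only checked for presence.

```python
import importlib

# Import name -> version from the Requirements list; None means "any version".
REQUIRED = {
    "tensorflow": "2.4.1",
    "keras": "2.4.3",
    "cv2": "4.5.9",
    "imgaug": None,
    "django": None,
}

def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")

def find_problems(required, lookup=installed_version):
    """Return {module: message} for every missing or mismatched dependency."""
    problems = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            problems[name] = "not installed"
        elif wanted is not None and found != wanted:
            problems[name] = "found {}, expected {}".format(found, wanted)
    return problems

if __name__ == "__main__":
    # Prints one line per problem; prints nothing when everything matches.
    for name, message in find_problems(REQUIRED).items():
        print("{}: {}".format(name, message))
```

The `lookup` parameter exists only so the comparison logic can be exercised without importing the real packages.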

Installation

There are two ways to get started with the project:

  • Simple installation, where you satisfy all dependencies on your own OS.
  • Using a Docker container.
  1. Clone this repository and install the dependencies:

    git clone https://github.com/RajKrishna2123/capstone_project
    cd capstone_project
    pip install -r requirements.txt
  2. Use a Docker container. First build the image:

    docker build -t project_container:updated1 .

    The following command runs the project:

    docker run --gpus all -it -v D:/struct_ocr_data:/app -p 8000:8000 project_container:updated1 /bin/bash

    Once the container is up and running, if you lose the connection or accidentally close the terminal, reconnect to the same container with:

    docker exec -it <container_id> bash

Usage

This project converts a single image or a batch of images into editable, structured output: the table layout found in each image is preserved and written out as a relational table in one pass.
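
As a rough illustration of what turning an image into a relational table involves, the sketch below places recognized words into a grid using detected row and column boxes. This is a generic post-processing technique, not the model or pipeline used by this project. All names and sample values are hypothetical, and boxes are assumed to be `(xmin, ymin, xmax, ymax)` in pixels.

```python
def center(box):
    """Return the (x, y) center of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)

def build_grid(row_boxes, col_boxes, words):
    """Place each (text, box) word into the cell whose row and column contain its center."""
    rows = sorted(row_boxes, key=lambda b: b[1])  # top to bottom
    cols = sorted(col_boxes, key=lambda b: b[0])  # left to right
    grid = [["" for _ in cols] for _ in rows]
    # Visit words in reading order so multi-word cells are joined sensibly.
    for text, box in sorted(words, key=lambda w: (w[1][1], w[1][0])):
        cx, cy = center(box)
        r = next((i for i, b in enumerate(rows) if b[1] <= cy <= b[3]), None)
        c = next((j for j, b in enumerate(cols) if b[0] <= cx <= b[2]), None)
        if r is None or c is None:
            continue  # word lies outside every detected row or column
        grid[r][c] = (grid[r][c] + " " + text).strip()
    return grid

# Hypothetical detections for a 2x2 table.
row_boxes = [(0, 0, 200, 20), (0, 20, 200, 40)]
col_boxes = [(0, 0, 100, 40), (100, 0, 200, 40)]
words = [
    ("Name", (5, 2, 40, 18)), ("Score", (105, 2, 150, 18)),
    ("Ada", (5, 22, 30, 38)), ("9.5", (105, 22, 130, 38)),
]
table = build_grid(row_boxes, col_boxes, words)
print(table)
```

The result is a list of rows, each a list of cell strings, which maps directly onto the rows of a relational table.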

Features

  1. Extensive Training Data: The AI model is trained on an extensive dataset of nearly 1 million high-resolution table images, which is intended to make document structure identification robust and accurate.

  2. MLOps Integration: The implementation follows MLOps practices for an automated end-to-end workflow. Continuous integration and delivery pipelines will be established for efficient model deployment and updates.

  3. Containerization: The system will be containerized for deployment as a web app and API service, which promotes scalability and ease of integration into other applications.

  4. Google Drive Integration: Users can process bulk data by providing Google Drive links.

  5. Flexible Data Outputs: The system supports several output formats, including CSV, MySQL databases, and XLSX, to suit different data management preferences.

  6. Integrated API Service: An integrated API will give other developers easy access to Structural OCR functionality from their own applications.
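
To make the output formats listed above concrete, here is a minimal sketch of the CSV case using only the Python standard library. It is an illustration under assumptions, not the project's exporter: the function name `table_to_csv` and the sample table are hypothetical, and MySQL or XLSX output would need third-party packages that are not shown here.

```python
import csv
import io

def table_to_csv(table):
    """Serialize a list of rows (each a list of cell strings) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()

# Hypothetical extracted table; the first row is treated as the header.
table = [["Name", "Score"], ["Ada", "9.5"], ["Lin, Wei", "8.0"]]
csv_text = table_to_csv(table)
print(csv_text)

# To write a file instead, open it with newline="" as the csv module recommends:
# with open("table.csv", "w", newline="", encoding="utf-8") as f:
#     csv.writer(f).writerows(table)
```

Cells that contain a comma are quoted automatically by the `csv` module, so extracted text does not need manual escaping.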

Documentation

Link to external documentation or detailed guides.

Credits

Special thanks to Rajeev Ratan sir for their awesome repository, which supported this project a lot.

Acknowledgements

License

This project is licensed under the MIT License - see the LICENSE file for details.
