This project provides an image-based document classification system that automatically classifies document images into predefined categories using deep learning models like EfficientNet, ResNet, and Vision Transformers (ViT).
These instructions will guide you through setting up the project on your local machine for development and testing purposes.
Before you begin, make sure you have the following installed:
- Git: required to clone the repository. Download Git, then verify the installation:
  git --version
- uv: an extremely fast Python package and project manager, written in Rust. You can read the uv documentation. Verify the installation:
  uv --version
- Make: a build utility that simplifies building, testing, and packaging software. You can read the Make documentation. Run the following command to check that Make is installed:
  make --version
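The three prerequisite checks above can also be done in one step. The following is a minimal sketch, not part of the project: the function name `find_missing_tools` and the `REQUIRED_TOOLS` tuple are illustrative, and the sketch assumes the tools are installed under their usual executable names (`git`, `uv`, `make`).

```python
import shutil

# Executable names assumed for the prerequisites listed above.
REQUIRED_TOOLS = ("git", "uv", "make")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools are installed.")
```

This only confirms that each executable is on your PATH; it does not check versions.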
Clone the project from GitHub:
git clone https://github.com/fiqihfathor/financial_document_classification.git
cd financial_document_classification
Install the project using the following command:
uv sync
Run the tests using the following command:
make test
Download Dataset
make dataset
Train Model
make train
You can change the configuration in config/config.yml
Test API
make server
Once the server is running, you can access the API at http://localhost:8000
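To confirm from a script that the server started by `make server` is reachable, something like the sketch below may help. It uses only the Python standard library and checks the base URL; the helper name `server_is_up` is hypothetical, and no project-specific endpoints are assumed.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:8000", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (for example 404 on "/").
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar problems.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```

An error status still counts as "up" here, because the goal is only to tell a running server from a missing one.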
- Python: The programming language the project is written in.
- PyTorch: Deep learning framework used to build and train the models.
- FastAPI: Web framework that serves the classification API.
- UV: Fast Python package and project manager used for dependency management.
- Make: Build utility that runs the project tasks (test, dataset, train, server).
- Git: Version control system used to manage the source code.
- MLflow: Open-source platform for managing and tracking machine learning experiments.
- Loguru: Logging library for Python.