Solution based on Text Detection with Differentiable Binarization and Adaptive Scale Fusion method and CNN-BLSTM-CTC OCR engine to convert a Handwritten Arabic document image into well-structured data.

Hedrax/AHDoc

Arabic Handwritten Document Data-Listing Solution

[Figure: network architecture]

How To Install

# clone the repo first so that requirement.txt is available
git clone https://github.com/Hedrax/AHDoc.git
cd AHDoc/

# create and activate the conda environment
conda create -n ahdoc python=3.10
conda activate ahdoc

conda install ipython pip

# python dependencies
pip install -r requirement.txt

How to use our pre-trained text detection model in ONNX format

Download the .onnx best weights

Data

Text-Detection

Evaluation

OCR ENGINE

Model Best-Weights

Data Preparation

Text-Detection

  • Datasets must follow the custom data format referenced above.
  • utils.py provides the functions needed to convert JSON annotations in Pascal format and txt annotations in YOLO format.
  • utils.py also provides the functions needed to list image names, find the corresponding label files, and write both to a .txt file.
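
To illustrate the kind of conversion involved, here is a minimal sketch that turns one YOLO-format label line into pixel-space corner coordinates. It is not taken from utils.py: the function name `yolo_line_to_box` is hypothetical, the usual YOLO line layout `class cx cy w h` with values normalized to the image size is assumed, and the output is a plain (xmin, ymin, xmax, ymax) box rather than the project's custom data format.

```python
def yolo_line_to_box(line, img_w, img_h):
    """Convert one 'class cx cy w h' line (normalized values) to pixel corners.

    Hypothetical helper for illustration; not part of the repository's utils.py.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    # YOLO stores the box centre and size as fractions of the image size.
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return class_id, (round(xmin), round(ymin), round(xmax), round(ymax))
```

For example, `yolo_line_to_box("0 0.5 0.5 0.5 0.5", 200, 100)` describes a box centred in a 200x100 image that covers half of each dimension.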

OCR Engine

  • Datasets must follow this format: all images are PNG files in the same directory as labels.txt, and each label in labels.txt must have the form imageNameWithoutExtension_groundTruth.
  • utils.py has all functions needed for dataset preparation
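
The labels.txt layout described above can be read with a few lines of Python. This is a sketch under stated assumptions, not the code in utils.py: the function name `parse_labels` is hypothetical, one label per line is assumed, and the line is split at the last underscore so that image names may themselves contain underscores (which assumes the ground-truth text does not).

```python
def parse_labels(text):
    """Map each image name (without extension) to its ground-truth text.

    Hypothetical helper; assumes one 'imageName_groundTruth' label per line.
    """
    labels = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: the LAST underscore separates the name from the text.
        name, sep, truth = line.rpartition("_")
        if not sep or not name or not truth:
            raise ValueError(f"line {line_no}: expected name_groundTruth, got {line!r}")
        labels[name] = truth
    return labels
```

A typical call would pass the file contents, for example `parse_labels(open("labels.txt", encoding="utf-8").read())`, where UTF-8 encoding is also an assumption.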

Train

  • Follow instructions in train.py
  • All training configurations are saved in config.py

Test

  • Put the weight files downloaded from the links above into ./Text Detection/weights/ and ./OCR Engine/weights/
  • Follow the instructions in inference.py, and in evaluation.py for the OCR Engine
  • All test configurations are saved in config.py
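
Because both weight directories contain a space in their path, a quick check before running inference can save a confusing failure. The sketch below only verifies that each directory exists and holds at least one file; the function name `missing_weight_dirs` is hypothetical and the check makes no assumption about the weight file names.

```python
from pathlib import Path

# Directory names as given in the instructions above (note the spaces).
WEIGHT_DIRS = ("Text Detection/weights", "OCR Engine/weights")


def missing_weight_dirs(repo_root="."):
    """Return the weight directories that are absent or contain no files."""
    root = Path(repo_root)
    missing = []
    for rel in WEIGHT_DIRS:
        directory = root / rel
        has_file = directory.is_dir() and any(p.is_file() for p in directory.iterdir())
        if not has_file:
            missing.append(rel)
    return missing
```

Calling `missing_weight_dirs()` from the repository root returns an empty list once both directories are populated.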

Results

Text-Detection

We compare the best weights of the universal model with the best weights of our model on our custom Arabic handwritten evaluation data.

| Weights | Precision (%) | Recall (%) | F-measure (%) |
|---|---|---|---|
| Universal Model | 61.53 | 34.60 | 41.33 |
| Our-Model | 81.66 | 78.82 | 79.07 |
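
For readers unfamiliar with these metrics, the sketch below computes precision, recall, and F-measure from counts of matched and unmatched boxes. It is a generic illustration, not the project's evaluation code: the function name `detection_scores` is hypothetical, and the source does not state how detections are matched to ground truth or how scores are aggregated. The reported F-measure for Our-Model (79.07) is not the harmonic mean of the reported precision and recall (about 80.21), which suggests some other aggregation, such as per-image averaging, although the source does not say.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f_measure) in percent.

    tp: detections matched to a ground-truth box
    fp: detections with no matching ground-truth box
    fn: ground-truth boxes that were not detected
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * v, 2) for v in (precision, recall, f_measure))
```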

OCR Module Performance Results

Our results on the test set of 18 fonts:

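| # | Number of Words | Solid CRR (%) | Solid WRR (%) | Salted CRR (%) | Salted WRR (%) | Bolded CRR (%) | Bolded WRR (%) | Notes |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 94.28 | 70.05 | 91.85 | 57.08 | 77.25 | 17.81 | Tested on 7-Character Words |
| 2 | 1 | 94.24 | 54.06 | 91.42 | 50.94 | 90.19 | 46.04 | - |
| 3 | 2 | 89.81 | 39.38 | 87.11 | 34.84 | 86.75 | 33.85 | - |
| 4 | 3 | 89.64 | 35.63 | 88.23 | 37.39 | 87.79 | 35.59 | - |
| 5 | 4 | 82.23 | 28.59 | 80.41 | 24.61 | 80.25 | 24.61 | - |
| 6 | 5 | 73.17 | 20.25 | 71.88 | 17.225 | 70.25 | 16.62 | - |
| 7 | 6 | 66.01 | 18.78 | 64.95 | 13.48 | 63.50 | 14.39 | - |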
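
The source does not define CRR and WRR. A common reading is character recognition rate and word recognition rate, and the sketch below computes them under that assumption: CRR as one minus the total edit distance divided by the total number of ground-truth characters, and WRR as the share of samples recognized exactly. The function names are hypothetical and the project's evaluation.py may compute these differently.

```python
def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def crr_wrr(pairs):
    """pairs: iterable of (ground_truth, prediction); returns (CRR %, WRR %)."""
    pairs = list(pairs)
    total_chars = sum(len(gt) for gt, _ in pairs)
    total_errors = sum(edit_distance(gt, pred) for gt, pred in pairs)
    exact = sum(1 for gt, pred in pairs if gt == pred)
    crr = 100 * (1 - total_errors / total_chars) if total_chars else 0.0
    wrr = 100 * exact / len(pairs) if pairs else 0.0
    return crr, wrr
```

Under these definitions a single wrong character costs little CRR but forfeits the whole sample for WRR, which is one possible reason for WRR falling much faster than CRR as samples get longer.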

Demo

Text-Detection

[Images: text-detection demo results]

OCR-ENGINE

[Image: OCR engine demo result]

References

  1. https://arxiv.org/abs/1911.08947
  2. https://github.com/zonasw/DBNet
  3. https://github.com/MhLiao/DB

Citation

If you use our work in your research, feel free to cite us:

@inproceedings{waly2024arabic,
  title={Arabic Handwritten Document OCR Solution with Binarization and Adaptive Scale Fusion Detection},
  author={Waly, Alhossien and Tarek, Bassant and Feteha, Ali and Yehia, Rewan and Amr, Gasser and Fares, Ahmed},
  booktitle={2024 6th Novel Intelligent and Leading Emerging Sciences Conference (NILES)},
  pages={316--319},
  year={2024},
  organization={IEEE}
}
