The following commands download the FUNSD dataset and extract from it the dataset used for the OCR comparison.
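# Clone the OCR comparison dataset repository and enter it
git clone https://github.com/OCRComparison/dataset.git
cd dataset
# Download the FUNSD archive and unpack it into ./FUNSD/
wget https://guillaumejaume.github.io/FUNSD/dataset.zip -O dataset.zip
unzip dataset.zip -d ./FUNSD/
# Build the OCR comparison dataset from the unpacked FUNSD files
python extract_dataset.py --input_path ./FUNSD/dataset/ --output_path ./OCRDataset/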
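
Before or after running the extraction, it can be useful to check that the archive unpacked completely. The sketch below is not part of the repository: `count_files` is a hypothetical helper, and the `training_data`/`testing_data` folders with `images` and `annotations` subfolders are an assumption about how the FUNSD archive is laid out. Adjust the paths if your copy differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the number of
# regular files with the given extension under a directory, or 0 if the
# directory does not exist.
count_files() {
    local dir="$1" ext="$2"
    if [ -d "$dir" ]; then
        find "$dir" -type f -name "*.${ext}" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Assumed layout of the unpacked FUNSD archive; change these paths if needed.
for split in training_data testing_data; do
    echo "$split: $(count_files "./FUNSD/dataset/$split/images" png) images," \
         "$(count_files "./FUNSD/dataset/$split/annotations" json) annotations"
done
```

If a split reports 0 files or the image and annotation counts differ, the download or unzip step most likely failed or was interrupted, and it is worth repeating before running `extract_dataset.py`.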