Adding dataset Kat-57 groundtruth dataset #173

Open
mirkh opened this issue Jan 20, 2025 · 1 comment

mirkh commented Jan 20, 2025

Hello! This is a dataset of transcribed catalogue cards from Lund University Library.

Here is our dataset YAML file:

schema: https://htr-united.github.io/schema/2023-06-27/schema.json
title: Kat -57 ground truth dataset
url: https://zenodo.org/records/14679534
authors: []
institutions: []
description: >-
 Background


 Catalogue -1957 is an alphabetic library catalogue listing Lund University
 Library’s holdings up to 1957. A project has been ongoing to scan and
 transcribe the catalogue cards.


 About 10,000 cards were manually transcribed to create a ground truth dataset.


 From 2178 card drawers, one drawer was selected for transcription for every
 letter of the alphabet, except for the letter S, for which two drawers were
 selected.


 The writing on the catalogue cards is a mix of typewriter and handwriting.
 There are more than 10 different hands.


 The cards were transcribed by a small team at the University Library.


 Dataset


 The set consists of PNG images with corresponding PAGE XML files. The
 transcriptions were made in eScriptorium.
language:
 - swe
 - deu
 - eng
 - fra
 - dan
 - nor
production-software: eScriptorium + Kraken
automatically-aligned: false
script:
 - iso: Latn
script-type: evenly-mixed
time:
 notBefore: '1880'
 notAfter: '1957'
hands:
 count: more-than-10
 precision: estimated
license:
 name: CC-BY 4.0
 url: https://creativecommons.org/licenses/by/4.0/
format: Page-XML
volume:
 - metric: pages
   count: 10000

alix-tz commented Jan 27, 2025

Hello,

Thank you very much for your contribution to the catalog!

I see that the list of authors is empty, but you list some on the Zenodo repository. Is this intentional? Note that you can also list institutions, such as the Lund University Library, if this is relevant.
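
For anyone who wants to catch this before submitting, here is a minimal sketch of a check for empty `authors` and `institutions` lists. It is only an illustration: `find_empty_fields` is a hypothetical helper, not part of the HTR-United tooling, and the record is a hand-copied subset of the YAML above written as a plain Python dict.

```python
# Hypothetical helper (an assumption for illustration, not an HTR-United API).
def find_empty_fields(record, fields=("authors", "institutions")):
    """Return the names of the given fields that are missing or empty."""
    return [name for name in fields if not record.get(name)]


# Subset of the catalog record from this issue, copied by hand into a dict.
record = {
    "title": "Kat -57 ground truth dataset",
    "url": "https://zenodo.org/records/14679534",
    "authors": [],
    "institutions": [],
}

# Report which of the checked fields still need to be filled in.
for name in find_empty_fields(record):
    print(f"'{name}' is empty; consider filling it in")
```

The check only tests whether each field is present and non-empty; it does not validate the entries against the catalog schema.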
