Update treebank.md
wannaphong authored Oct 18, 2024
1 parent cce005d commit 4087369
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/tasks/treebank.md
@@ -6,4 +6,6 @@
| ----------------------------- | ------------------------------------------------------------ | ---------------------------------- | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| UD Thai PUD | This is a part of the Parallel Universal Dependencies (PUD) treebanks created for the CoNLL 2017 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. | 1,000 sentences | CC BY-SA 3.0 | Universal Dependencies | [GitHub](https://github.com/UniversalDependencies/UD_Thai-PUD) |
| Thai Treebanks Dataset (thtb) | To enable research opportunities with very few Thai Computational Linguistic resources, we willingly introduce fundamental high-level language resources built with passion, Thai Treebanks, built from scratch for researchers and enthusiasts. | 5,200 sentences | CC BY 4.0 | Pechlada Seenual, Thodsaporn Chay-intr and Thanaruk Theeramunkong | [GitHub](https://github.com/tchayintr/thtb) |
| Blackboard Treebank | Blackboard Treebank is a Thai dependency corpus based on the LST20 Annotation Guideline. It features dependency structures, constituency structures, word boundaries, named entities, clause boundaries, and sentence boundaries. | 122,851 clauses (38,558 sentences) | CC BY 3.0 | Prachya Boonkwan, NECTEC | [bitbucket](https://bitbucket.org/kaamanita/blackboard-treebank/) |
| Thai Universal Dependency Treebank (TUD) | Thai Universal Dependency Treebank, consisting of 3,627 trees annotated in accordance with the Universal Dependencies (UD) framework. | 3,627 trees | | Chulalongkorn University | [GitHub](https://github.com/nlp-chula/TUD) |
| Thai Discourse Treebank | Thai Discourse Treebank is the first and largest Thai corpus annotated with explicit discourse relations in the style of the English Penn Discourse Treebank 3 scheme. The final corpus consists of 10,602 sentences from 384 documents, 180 of which have complete annotation of discourse connectives and their two argument spans. | | | Ponrawee Prasertsom, Apiwat Jaroonpol, Attapol T. Rutherford | [GitHub](https://github.com/nlp-chula/thai-discourse-treebank/tree/main/data/th-tdtb) |
