After supporting 130 languages, HanLP has officially released an open-source Ancient Chinese model. This model supports automatic word segmentation, lemmatization, part-of-speech tagging, and dependency parsing for Ancient Chinese. Thanks to multi-task learning, this single model can handle all of these tasks, as well as coarse-grained/fine-grained segmentation and UPOS/XPOS/PKU part-of-speech tagging sets.
- Blog post: https://www.hankcs.com/nlp/hanlp-ancient-chinese-processing-model-released.html
- Demo: https://github.com/hankcs/HanLP/blob/master/plugins/hanlp_demo/hanlp_demo/lzh/demo_mtl.py
- Performance: https://hanlp.hankcs.com/docs/api/hanlp/pretrained/mtl.html#hanlp.pretrained.mtl.KYOTO_EVAHAN_TOK_LEM_POS_UDEP_LZH
- Visualization:
Full Changelog: v2.1.0...v2.1.1