Source code for "Efficient Classification of Long Documents Using Transformers" (ACL 2022).
Please refer to our paper for more details, and cite it if you find this repository useful:
@inproceedings{park-etal-2022-efficient,
    title = "Efficient Classification of Long Documents Using Transformers",
    author = "Park, Hyunji and
      Vyas, Yogarshi and
      Shah, Kashif",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-short.79",
    doi = "10.18653/v1/2022.acl-short.79",
    pages = "702--709",
}
Running `train.py` with the `--data 20news` flag will download and prepare the 20 Newsgroups dataset available via `sklearn.datasets` (following CogLTX).
We adopt the train/dev/test split from the ToBERT paper.
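For orientation, below is a minimal sketch of how the 20 Newsgroups corpus can be fetched through `sklearn.datasets`; the actual preprocessing lives in `train.py`, and the dev split shown here (10% of the training portion) is only an illustrative assumption, not the exact split from the ToBERT paper.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import train_test_split

# Download (and locally cache) the official 20 Newsgroups train and test portions.
train_data = fetch_20newsgroups(subset="train", shuffle=False)
test_data = fetch_20newsgroups(subset="test", shuffle=False)

# Illustrative dev split only; the repo follows the split used in the ToBERT paper.
train_texts, dev_texts, train_labels, dev_labels = train_test_split(
    train_data.data, train_data.target, test_size=0.1, random_state=42
)

print(f"train: {len(train_texts)}, dev: {len(dev_texts)}, test: {len(test_data.data)}")
```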