This repository hosts the dataset released with the paper "Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning" (AAAI 2020).

Authors: Tianyi Wang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Qiong Zhang
The dataset is encrypted to protect personal information. It contains three columns, in the format shown in the sample table. Each value in `dialogue_ids` identifies one complete trial. The `utterances` column contains all utterances made in court, stored as lists of words. The `roles` column gives the speaker's role in court: judge, plaintiff, defendant, witness, or other.
Download link: https://pan.baidu.com/s/1pJnr1Y9utFSKSm_xNod5nA (extraction code: ts8f)
dialogue_ids | utterances | roles |
---|---|---|
fcb0f588-fd69-300e-b59e-7c855c3775d2 | [3b920ce1-6ac8-34f1-99ef-7f91f77d7439, a4715ee... | 9c45c2f1-1761-3daa-ad31-1ff8703ae846 |
0312afb7-0b6c-3f43-9b6b-923e1fcdefee | [ecfec5aa-a247-3b06-af42-2d94711fc093, 97da321... | afd0b036-625a-3aa8-b639-9dc8c8fff0ff |
8a401daa-2a52-3c9c-a252-79cbb05f753f | [b3cc8c7b-838e-32ca-8e07-65893cb8c064, 2763369... | afd0b036-625a-3aa8-b639-9dc8c8fff0ff |
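
The column layout above suggests one way to rebuild whole dialogues from the rows. The sketch below is illustrative only. It assumes that each row holds a single utterance (a list of encrypted word IDs) together with the role of its speaker, and that rows sharing a dialogue ID belong to the same trial in their original order. This README does not confirm that reading. The sample values (`dlg-1`, `w1`, `role-a`) and the helper `group_by_dialogue` are hypothetical placeholders, not real data or part of the release.

```python
from collections import defaultdict

# Hypothetical rows in (dialogue_id, utterance_tokens, role) order.
# Real values are encrypted identifiers; these are short stand-ins.
rows = [
    ("dlg-1", ["w1", "w2"], "role-a"),
    ("dlg-1", ["w3"], "role-b"),
    ("dlg-2", ["w4", "w5", "w6"], "role-a"),
]


def group_by_dialogue(rows):
    """Collect (role, tokens) turns per dialogue ID, preserving row order.

    Assumes one utterance per row, which is an interpretation of the
    sample table rather than a documented guarantee.
    """
    dialogues = defaultdict(list)
    for dialogue_id, tokens, role in rows:
        dialogues[dialogue_id].append((role, tokens))
    return dict(dialogues)


dialogues = group_by_dialogue(rows)
for dialogue_id, turns in dialogues.items():
    print(dialogue_id, len(turns), "turn(s)")
```

If the released file instead stores a whole dialogue per row, the grouping step is unnecessary and each row can be read as one dialogue directly.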