dialogue_pretrain

This repository hosts the dataset released with the paper "Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning" (AAAI 2020).

Authors: Tianyi Wang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Qiong Zhang

The dataset is encrypted to protect personal privacy. It contains three columns, in the format shown in the sample below. Each value in the dialogue_ids column identifies one complete trial. The utterances column holds the utterances spoken in court, each represented as a list of words. The roles column gives the speaker's role in court: judge, plaintiff, defendant, witness, or other.

Download link: https://pan.baidu.com/s/1pJnr1Y9utFSKSm_xNod5nA (code: ts8f)

| dialogue_ids | utterances | roles |
| --- | --- | --- |
| fcb0f588-fd69-300e-b59e-7c855c3775d2 | [3b920ce1-6ac8-34f1-99ef-7f91f77d7439, a4715ee... | 9c45c2f1-1761-3daa-ad31-1ff8703ae846 |
| 0312afb7-0b6c-3f43-9b6b-923e1fcdefee | [ecfec5aa-a247-3b06-af42-2d94711fc093, 97da321... | afd0b036-625a-3aa8-b639-9dc8c8fff0ff |
| 8a401daa-2a52-3c9c-a252-79cbb05f753f | [b3cc8c7b-838e-32ca-8e07-65893cb8c064, 2763369... | afd0b036-625a-3aa8-b639-9dc8c8fff0ff |
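
The three-column layout above could be consumed roughly as in the sketch below. This is a minimal illustration, not the authors' loading code: the README does not state the file format, so the example works on in-memory rows. It assumes that each row holds one utterance (a list of encrypted word IDs) with a single encrypted role ID, and that rows sharing a `dialogue_ids` value belong to the same trial. The toy values (`d1`, `w1`, `r_judge`, ...) and the helper names `group_by_dialogue` and `role_turn_counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows mimicking the three columns.
# In the released data every value is an encrypted ID, not readable text.
rows = [
    {"dialogue_ids": "d1", "utterances": ["w1", "w2"], "roles": "r_judge"},
    {"dialogue_ids": "d1", "utterances": ["w3"], "roles": "r_plaintiff"},
    {"dialogue_ids": "d2", "utterances": ["w1", "w4", "w5"], "roles": "r_judge"},
]


def group_by_dialogue(rows):
    """Collect (role, tokens) turns per dialogue, preserving row order.

    Assumption: one row = one utterance with one role.
    """
    dialogues = defaultdict(list)
    for row in rows:
        turn = (row["roles"], list(row["utterances"]))
        dialogues[row["dialogue_ids"]].append(turn)
    return dict(dialogues)


def role_turn_counts(dialogues):
    """Count how many turns each (encrypted) role ID takes across all dialogues."""
    counts = Counter()
    for turns in dialogues.values():
        for role, _tokens in turns:
            counts[role] += 1
    return dict(counts)


dialogues = group_by_dialogue(rows)
counts = role_turn_counts(dialogues)
```

Because role values are encrypted, the counts are keyed by opaque IDs; mapping them back to judge, plaintiff, and so on is not possible from the data alone.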