This repository contains the source code to extract the dialogs used in the following paper:

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems, arXiv:1506.08909.

# psql -d template1
> create database ubuntu;

# ln -s /path/to/ubuntu/corpus data
# node createTable.js
# pypy main.py

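This produces a file named ubuntu.sql. The copy command in the next step reads it from /tmp/ubuntu.sql, so if the file is not written to /tmp, move it there or adjust the path in the copy command.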
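
Before loading the file, it can help to check that every row has the same number of fields. The sketch below is not part of this repository. It assumes ubuntu.sql is in PostgreSQL's default COPY text format (one row per line, fields separated by tabs), which is what a plain "copy messages from" expects. The function name field_counts is hypothetical.

```python
import sys
from collections import Counter


def field_counts(lines):
    """Count how many lines have each number of tab-separated fields.

    Assumes PostgreSQL's default COPY text format: one row per line,
    fields separated by literal tab characters.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line == "\\.":
            # "\." is the end-of-data marker in COPY text format; skip it.
            continue
        counts[line.count("\t") + 1] += 1
    return counts


# Only runs when a path is given, e.g.: python check_fields.py ubuntu.sql
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for n_fields, n_lines in sorted(field_counts(f).items()):
            print(f"{n_fields} fields: {n_lines} lines")
```

If the output lists more than one field count, some rows probably do not match the messages table and the copy is likely to fail on them.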

# psql -d ubuntu
> copy messages from '/tmp/ubuntu.sql';

# node createTable.js index

# node extractDialogs.js nicks.txt
