Commit 57998bc

authored Feb 10, 2019
NMF files uploaded
1 parent b866fe5 commit 57998bc

36 files changed, +1015625 -0 lines changed
 

‎NMF model folders/SeaNMF-master/LICENSE

+674
Large diffs are not rendered by default.

‎NMF model folders/SeaNMF-master/NMF Short text topic modelling.ipynb

+1,015
Large diffs are not rendered by default.
+26
@@ -0,0 +1,26 @@
# SeaNMF

This is the implementation of the paper
- [Tian Shi](http://life-tp.com/Tian_Shi/), Kyeongpil Kang, [Jaegul Choo](https://sites.google.com/site/jaegulchoo/) and [Chandan K. Reddy](http://people.cs.vt.edu/~reddy/), "Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations", In Proceedings of the International Conference on World Wide Web (WWW), Lyon, France, April 2018. [PDF](http://dmkd.cs.vt.edu/papers/WWW18.pdf)

## Requirements

- Python 3.5.2
- argparse

## Usage

#### Data Process
- Tokenize with [NLTK](https://www.nltk.org/), [SpaCy](https://spacy.io/) or [CoreNLP](https://stanfordnlp.github.io/CoreNLP/).
- Remove special characters.
- Remove stop-words.
- Edit the arguments in ```data_process.py``` to match your data.
- Run ```python3 data_process.py``` to prepare the document-term matrix and vocabulary.
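
The preprocessing steps above can be sketched in plain Python. This is only an illustration of the idea, not the code in ```data_process.py```: the regex tokenizer stands in for NLTK, SpaCy or CoreNLP, and the names `STOP_WORDS`, `tokenize` and `build_doc_term` as well as the tiny stop-word list are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real run would use
# the list shipped with NLTK or SpaCy).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase, replace special characters with spaces, split, drop stop-words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def build_doc_term(docs):
    """Return (vocab, rows).

    vocab maps each word to a column index; rows holds one sparse
    {column index: count} dict per document.
    """
    tokenized = [tokenize(d) for d in docs]
    words = sorted({tok for doc in tokenized for tok in doc})
    vocab = {w: i for i, w in enumerate(words)}
    rows = [dict(Counter(vocab[tok] for tok in doc)) for doc in tokenized]
    return vocab, rows


docs = ["The cat sat in the hat!", "A cat is a cat."]
vocab, rows = build_doc_term(docs)
print(vocab)
print(rows)
```

Storing each row as a sparse dict keeps the sketch dependency-free; short texts produce very sparse document-term matrices, so a dense array would mostly hold zeros.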

#### Train

- Run ```python3 train.py --help``` to see the full list of options.

#### Evaluation

- Run ```python3 vis_topic.py``` to calculate the PMI and visualize the top keywords in each topic.
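
As a rough guide to what a PMI score measures, the sketch below averages document-level pointwise mutual information, log(P(w1, w2) / (P(w1) P(w2))), over all pairs of a topic's top keywords. It is an assumption about the general technique, not a copy of ```vis_topic.py```: the function name `topic_pmi`, the document-level co-occurrence counting and the smoothing constant `eps` are all hypothetical choices for this example.

```python
import math
from itertools import combinations


def topic_pmi(top_words, docs, eps=1e-12):
    """Average document-level PMI over all pairs of a topic's top words.

    docs is a list of token lists. eps is a small smoothing constant
    (an assumption) that avoids log(0) when two words never co-occur.
    """
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in s for w in words) for s in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        joint = prob(w1, w2)
        scores.append(math.log((joint + eps) / (prob(w1) * prob(w2) + eps)))
    # A topic with fewer than two keywords has no pairs to score.
    return sum(scores) / len(scores) if scores else 0.0


docs = [["cat", "hat"], ["cat", "hat"], ["dog"], ["dog"]]
print(topic_pmi(["cat", "hat"], docs))  # words that always co-occur: positive
print(topic_pmi(["cat", "dog"], docs))  # words that never co-occur: strongly negative
```

Higher values indicate that a topic's keywords tend to appear in the same documents, which is the usual reading of PMI as a topic-coherence measure.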
Binary file not shown.