Fake News Challenge

Prajwal Rao (5176504) & Julian Blair (3463793) | COMP9417 2018s1

Acknowledgement

The Fake News Challenge was hosted in 2017 by a group of academic and industry volunteers. Learn more about the challenge here.

Thanks to the same team for providing a baseline implementation, which was used as a starting point for our project. The GitHub repository for the baseline can be found here.

Install

Python packages can be installed from the command line using:

pip install [packagename]

NLTK packages can be installed within Python using:

import nltk
nltk.download('[packagename]')

Prerequisites

Python 3
pip packages
- numpy
- scipy
- scikit-learn (imported as sklearn)
- tqdm
- nltk (see below)
nltk packages
- punkt
- wordnet
- averaged_perceptron_tagger
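
Before running any of the scripts, it can help to check which of the pip packages above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of this repository. It checks import names, so scikit-learn appears as `sklearn`.

```python
from importlib.util import find_spec

# Import names for the pip packages listed above.
REQUIRED_MODULES = ["numpy", "scipy", "sklearn", "tqdm", "nltk"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks the Python packages; the NLTK data packages (punkt, wordnet, averaged_perceptron_tagger) still need to be downloaded with `nltk.download` as described in the Install section.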

Navigation

The data folder contains the CSVs provided for the challenge as training data, testing data, and competition benchmarking data.

The src folder contains the provided baseline code, as well as our implementations built on top of it.

Source subfolders

Subfolder	Version	Description
baseline	0	The baseline provided.
word_overlap	1	Restructures the classification problem from multi-class to multi-tier two-class, and modifies the word_overlap feature to filter common words.
paraphrasing	2	Adds lexical dimensionality-reduction to existing n-gram features.
final	3	Adds Naive Bayes classifier and all-caps frequency feature.
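
To illustrate the idea behind the modified word_overlap feature, here is a minimal sketch of a headline/body overlap score that ignores common words. It is not the code in src: the stop-word set `COMMON_WORDS`, the regex tokenizer, and the function names are assumptions made for illustration, and the sketch skips any lemmatization the real feature may apply.

```python
import re

# Hypothetical list of common words; the project's actual list may differ.
COMMON_WORDS = {"a", "an", "the", "of", "in", "on", "to", "is", "and", "for"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_overlap(headline, body, common_words=COMMON_WORDS):
    """Jaccard overlap between headline and body tokens, ignoring common words."""
    headline_tokens = set(tokenize(headline)) - common_words
    body_tokens = set(tokenize(body)) - common_words
    union = headline_tokens | body_tokens
    if not union:
        # Nothing left after filtering, so there is no overlap to measure.
        return 0.0
    return len(headline_tokens & body_tokens) / len(union)


if __name__ == "__main__":
    print(word_overlap("The cat sat", "A cat on the mat"))
```

Filtering common words first keeps words such as "the" from inflating the overlap between an unrelated headline and body.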

Each source subfolder contains the following files and folders:

Item	Description
fnc_kfold.py	The main execution script. Extracts CSVs, generates data splits for training/testing, precomputes features, then fits the hold-out set and test data.
feature_engineering.py	Contains helper functions for feature precomputation.
features/	Contains precomputed feature files for later use in fitting the provided data.
utils/	Contains helper functions provided in baseline for dataset generation, test split creation, and scoring.
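
For orientation, the scoring step can be sketched as follows, assuming the scorer in utils follows the weighting published for the Fake News Challenge: 0.25 for getting the related/unrelated distinction right, plus a further 0.75 for the exact stance when the pair is related. The function name `fnc_score` is hypothetical and this is a simplified sketch, not the baseline's scorer.

```python
# Stance labels that count as "related" in the challenge.
RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold, predicted):
    """Sum the weighted score over paired gold and predicted stance labels."""
    score = 0.0
    for gold_label, predicted_label in zip(gold, predicted):
        if gold_label == predicted_label:
            score += 0.25
            if gold_label != "unrelated":
                # Exact match on a related stance earns extra credit.
                score += 0.50
        if gold_label in RELATED and predicted_label in RELATED:
            # Credit for recognising the pair as related at all.
            score += 0.25
    return score


if __name__ == "__main__":
    gold = ["agree", "unrelated", "discuss", "disagree"]
    predicted = ["agree", "unrelated", "agree", "unrelated"]
    print(fnc_score(gold, predicted))
```

Under this weighting a correct related stance is worth 1.0, a correct "unrelated" is worth 0.25, and a related pair given the wrong related stance is worth 0.25.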

Execution

To run the feature generation and scoring process, navigate to any of these source subfolders and run the following command:

python fnc_kfold.py

Use python3 in place of python if that is the name of your Python 3 interpreter.

