
COMP 551

Applied Machine Learning coursework at McGill University.

Project 1 - Introduction

As an introduction to practical machine learning, this project implements logistic regression with gradient descent and linear discriminant analysis from scratch. Other introductory steps include loading and cleaning the dataset, extracting some basic additional features, creating a training-validation-test split, and using K-fold cross-validation to evaluate the models.
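
For illustration, here is a minimal sketch of logistic regression trained with batch gradient descent on the cross-entropy loss; the function names, learning rate, and iteration count are placeholders rather than the repository's actual implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """X: (n_samples, n_features) array, y: (n_samples,) array of 0/1 labels."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend a bias column
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ w)                 # predicted probabilities
        grad = X.T @ (p - y) / len(y)      # gradient of the cross-entropy loss
        w -= lr * grad                     # gradient descent step
    return w

def predict(X, w):
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    return (sigmoid(X @ w) >= 0.5).astype(int)

# Quick self-check on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
w = fit_logistic_regression(X, y)
print("training accuracy:", (predict(X, w) == y).mean())
```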

Project 2 - Reddit Comment Classification

This project uses the text of Reddit comments to classify them by the subreddit in which they were posted. The data is pre-processed and feature vectors are extracted, then fed into three models: Bernoulli Naive Bayes (from scratch), Multinomial Naive Bayes (from scikit-learn), and an LSTM neural network (using PyTorch). Results were submitted to a Kaggle leaderboard of McGill and University of Montreal graduate-level students.
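
As a rough idea of the scikit-learn baseline, here is a sketch of a Multinomial Naive Bayes text-classification pipeline; the toy comments, labels, vectorizer settings, and smoothing parameter are illustrative and not the project's actual pre-processing or data:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

# Toy stand-in data; the real project loads the Kaggle comment dataset.
comments = [
    "great goal in overtime", "who is starting in net", "trade deadline rumours",
    "patch notes are out", "new balance changes", "best loadout for ranked",
    "this sourdough recipe is amazing", "how long to proof the dough", "tips for a crispy crust",
]
subreddits = ["hockey", "hockey", "hockey",
              "gaming", "gaming", "gaming",
              "baking", "baking", "baking"]

# Bag-of-words-style features feeding a Naive Bayes classifier.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("nb", MultinomialNB(alpha=0.5)),
])

scores = cross_val_score(clf, comments, subreddits, cv=3)
print("3-fold accuracy:", scores.mean())
```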

Project 3 - MNIST Image Classification

This classifier uses a convolutional neural network to classify images from a modified version of the MNIST dataset. Pre-processing techniques and the CNN structure are varied to find the best-performing configuration. The datasets are too large to be included directly in the repository, but they can be found on Kaggle, along with the competition leaderboard.
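
For context, a generic PyTorch CNN for 28x28 grayscale digit images might look like the sketch below; the layer sizes and depth are a baseline example, not the tuned architecture used in the project:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # 28x28 -> 14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SimpleCNN()
logits = model(torch.randn(8, 1, 28, 28))  # batch of 8 dummy images
print(logits.shape)                        # torch.Size([8, 10])
```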

Project 4 - Paper Reproduction of CNN for Sentence Classification

This is a reproduction of the results in Yoon Kim's Convolutional Neural Networks for Sentence Classification. writeup.pdf contains the full report of this project.

The CNN reproduction and hyperparameter tuning are performed in ccn-rand.ipynb. This file is configured to run locally, but executing it on a GPU will greatly reduce the runtime.

Data pre-processing code from Kim's original paper can be found in OriginalCode.ipynb. The LSTM and Naive Bayes model implementations are located in BasicModels.ipynb. These files were written to be run on Google Colab, and the "baseFilepath" variable must match the location of the data in your Google Drive.
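
For reference, the usual Colab setup looks like the snippet below; the folder assigned to baseFilepath is only an example and must be changed to wherever the data sits in your Google Drive:

```python
# Run inside Google Colab: mount your Drive, then point baseFilepath at the data.
from google.colab import drive
drive.mount('/content/drive')

baseFilepath = '/content/drive/My Drive/COMP551/project4/'  # example path; adjust to your Drive layout
```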
