Algorithms course materials for the Lede program at Columbia Journalism School

ella24/lede-algorithms

Algorithms - Lede 2018

A course on algorithms for doing journalism.

Course overview

This is a course on algorithmic data analysis in journalism. We will cover basic methods for working with large(ish) data sets, and a variety of techniques used in story production, from regression to simulation to machine learning.

There are two main ways algorithms are combined with journalism: we can use algorithms to analyze data to produce stories, and we can do stories about algorithms that affect people's lives. We will do both.

  • Instructor: Jonathan Stray, jms2361@columbia.edu
  • Dates: Mondays and Wednesdays, 7/18-8/29
  • Class: 10am-1pm
  • Location: World Room
  • Lab: 2pm-5pm
  • Slack channel: #algorithms

Schedule

This is a rough outline, and subject to change, but your homework assignments will always be up to date!

Every Monday, you must bring in an algorithmic story to share with the class.

Homework is due before the following class.

Week 1 - Introduction to Algorithms

Algorithms for doing journalism, journalism about algorithms. The purpose of mathematical formalism.

Homework:

  • Assignment notebook. Show that an average of averages is not the same as the overall average. Work out when the overall average and an average of averages are equal. Show that this really works, by computing the values in Jupyter.
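
As a starting point for this kind of check, here is a minimal sketch in Python. It is not the assignment notebook's solution: the scores and the function names are made up for illustration. It compares the overall average with the unweighted average of group averages, and shows that weighting each group average by its group size recovers the overall average. Equal group sizes are one case where the two already agree.

```python
# Hypothetical data: scores for two classes of unequal size.
groups = {
    "class_a": [90, 80],
    "class_b": [60, 60, 60, 60],
}


def overall_average(groups):
    """Average over every individual value, ignoring group boundaries."""
    values = [v for g in groups.values() for v in g]
    return sum(values) / len(values)


def average_of_averages(groups):
    """Unweighted mean of the per-group means."""
    group_means = [sum(g) / len(g) for g in groups.values()]
    return sum(group_means) / len(group_means)


def weighted_average_of_averages(groups):
    """Mean of the per-group means, each weighted by its group size."""
    total = sum(len(g) for g in groups.values())
    return sum(len(g) * (sum(g) / len(g)) for g in groups.values()) / total


print(overall_average(groups))               # about 68.33
print(average_of_averages(groups))           # 72.5
print(weighted_average_of_averages(groups))  # matches the overall average

# With equal group sizes the unweighted version already agrees.
equal_groups = {"class_a": [90, 80], "class_b": [60, 60]}
print(overall_average(equal_groups), average_of_averages(equal_groups))
```

In the unequal case the small class pulls the average of averages up, because its mean counts as much as the mean of a class twice its size.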

Week 2 - Text Processing

In this class we will develop the ubiquitous vector space document model, with TF-IDF weighting. You will learn to algorithmically summarize documents by extracting keywords, how to compare documents for similarity, and how a search engine and Google News work.
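
To make the vector space model concrete, here is a minimal sketch in plain Python. It assumes one common TF-IDF variant (term frequency times log(N / document frequency)); the weighting used in class may differ. The toy corpus and the function names are hypothetical. Each document becomes a sparse vector of term weights, keywords are the highest-weighted terms, and similarity is the cosine of the angle between two vectors.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus, for illustration only.
docs = [
    "the senate passed the budget bill",
    "the house debated the budget",
    "the team won the championship game",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_keywords(vector, k=3):
    """The k terms with the highest TF-IDF weight."""
    ranked = sorted(vector.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]


vectors = tfidf_vectors(docs)
print(top_keywords(vectors[0]))
print(cosine_similarity(vectors[0], vectors[1]))  # share "budget"
print(cosine_similarity(vectors[0], vectors[2]))  # share only "the"
```

Because "the" appears in every document, its weight is zero, so it contributes nothing to keywords or similarity. A search engine can be sketched the same way by treating the query as a short document and ranking the corpus by cosine similarity to it.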

References:

Homework:

  • Assignment notebook. Analyze the State of the Union speeches of the 20th century to see how topics changed by decade.

Week 3 - Regression

Week 4 - Machine Learning

Week 5 - Network Analysis

Week 6 - Simulations 1

Week 7 - Simulations 2
