CS 4650 and CS 7650 will meet jointly on Tuesdays and Thursdays, 3:05-4:25 PM, in College of Computing 101.
This is a (permanently) provisional schedule. Readings, notes, slides, and homework will change. Readings and homeworks are final at the time of the class before they are due (e.g., Thursday's readings are final on the preceding Tuesday); problem sets are final on the day they are "out." Please check for updates until then.
- History of NLP and modern applications. Review of probability.
- Reading: Chapter 1 of Linguistic Fundamentals for NLP. You should be able to access this PDF for free from a Georgia Tech computer.
- Optional reading: Section 2.1 of Foundations of Statistical NLP. A PDF version is accessible through the GT library.
- Optional reading: these other reviews of probability.
- Project 0 out
- Slides
- Bag-of-words models, naive Bayes, and sentiment analysis (sketch after this session's list).
- Homework 1 due
- Reading: my notes, chapter 3.
- Optional readings: Sentiment analysis and opinion mining, especially parts 1, 2, 4.1-4.3, and 7; Chapters 0-0.3, 1-1.2 of LXMLS lab guide
- Slides
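As a concrete companion to this session, here is a minimal sketch of a bag-of-words naive Bayes sentiment classifier with add-one smoothing. The toy training corpus and labels are invented for illustration; chapter 3 of the notes gives the real derivation.

```python
import math
from collections import Counter, defaultdict

# Toy labeled corpus (invented for illustration).
train = [
    ("great movie , loved it", "pos"),
    ("what a great cast", "pos"),
    ("boring plot , bad acting", "neg"),
    ("bad , bad , bad", "neg"),
]

# Count words per class, and class frequencies.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def predict(text):
    """Return the class maximizing log P(y) + sum_i log P(w_i | y),
    with add-one (Laplace) smoothing on the word likelihoods."""
    scores = {}
    for y in class_counts:
        total = sum(word_counts[y].values())
        score = math.log(class_counts[y] / sum(class_counts.values()))
        for w in text.split():
            if w in vocab:  # ignore unseen words entirely
                score += math.log((word_counts[y][w] + 1) / (total + len(vocab)))
        scores[y] = score
    return max(scores, key=scores.get)

print(predict("loved the cast"))   # expected: pos
print(predict("boring and bad"))   # expected: neg
```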
- Discriminative classifiers: perceptron and passive-aggressive learning; word-sense disambiguation (perceptron sketch below).
- Problem set 0 due
- Problem set 1a out
- Reading: my notes, chapter 5-5.2.
- Optional supplementary reading: Parts 4-7 of log-linear models; survey on word sense disambiguation
- Optional advanced reading: AdaGrad; passive-aggressive learning
- Slides
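A minimal sketch of the multiclass perceptron update, applied to invented word-sense disambiguation data (the senses, context words, and label names are all hypothetical): on each mistake, add the feature vector to the true class's weights and subtract it from the predicted class's.

```python
from collections import Counter

def perceptron_train(data, labels, epochs=5):
    """Multiclass perceptron: on a mistake, add the feature vector to the
    true class's weights and subtract it from the predicted class's."""
    weights = {y: Counter() for y in labels}
    for _ in range(epochs):
        for feats, y in data:
            y_hat = max(labels, key=lambda c: sum(weights[c][f] * v
                                                  for f, v in feats.items()))
            if y_hat != y:
                for f, v in feats.items():
                    weights[y][f] += v
                    weights[y_hat][f] -= v
    return weights

# Toy word-sense disambiguation data (invented): context words -> sense of "bank".
data = [
    (Counter("river shore water".split()), "bank/GEO"),
    (Counter("money deposit loan".split()), "bank/FIN"),
    (Counter("water fishing shore".split()), "bank/GEO"),
    (Counter("loan interest money".split()), "bank/FIN"),
]
w = perceptron_train(data, labels=["bank/GEO", "bank/FIN"])
test = Counter("deposit money".split())
print(max(w, key=lambda c: sum(w[c][f] * v for f, v in test.items())))  # bank/FIN
```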
- Logistic regression and online learning (SGD sketch below).
- Homework 2 due
- Reading: my notes, chapter 5.3-5.6.
- Optional supplementary reading: Parts 4-7 of log-linear models
- Slides
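A companion sketch of binary logistic regression fit by stochastic gradient ascent on the log-likelihood; the toy sentiment data and the `**bias**` feature name are invented for illustration.

```python
import math
from collections import Counter

def sgd_logistic(data, epochs=20, lr=0.5):
    """Binary logistic regression by stochastic gradient ascent on the
    log-likelihood: w += lr * (y - P(y=1|x)) * x, for y in {0, 1}."""
    w = Counter()
    for _ in range(epochs):
        for feats, y in data:
            z = sum(w[f] * v for f, v in feats.items())
            p = 1.0 / (1.0 + math.exp(-z))  # P(y=1 | x; w)
            for f, v in feats.items():
                w[f] += lr * (y - p) * v    # gradient of the log-likelihood
    return w

# Toy sentiment data (invented), with a bias feature.
def feats(text):
    c = Counter(text.split())
    c["**bias**"] = 1
    return c

data = [(feats("good fun good"), 1), (feats("dull dreadful"), 0),
        (feats("fun fun"), 1), (feats("dull plot"), 0)]
w = sgd_logistic(data)
z = sum(w[f] * v for f, v in feats("good fun").items())
print(1.0 / (1.0 + math.exp(-z)))  # close to 1
```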
- Problem set 1a due on September 3 at 3pm
- Problem set 1b out on September 3 at 3pm
- Reading: Expectation maximization chapter by Michael Collins
- Optional supplementary reading: Tutorial on EM
- Optional advanced reading: Nigam et al.; Word sense clustering
- Demo: Word sense clustering with EM (a minimal EM sketch also follows this list)
- Slides
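Loosely in the spirit of the word sense clustering demo (but not the demo itself), a minimal EM sketch that soft-clusters invented contexts of an ambiguous word into two senses using a mixture of unigrams; the data, smoothing constant, and random initialization are all assumptions.

```python
import math
import random
from collections import Counter

# Toy contexts of an ambiguous word (invented); goal: two soft sense clusters.
docs = [Counter(d.split()) for d in
        ["river water shore", "water shore mud", "money loan rate",
         "loan rate interest", "shore river mud", "rate money interest"]]
vocab = sorted({w for d in docs for w in d})
K = 2

random.seed(0)
# Random soft assignments q(k | doc), normalized per document.
q = [[random.random() for _ in range(K)] for _ in docs]
q = [[v / sum(row) for v in row] for row in q]

for _ in range(50):
    # M-step: mixture weights and add-0.1 smoothed word distributions per cluster.
    pi = [sum(q[i][k] for i in range(len(docs))) / len(docs) for k in range(K)]
    phi = [{w: 0.1 + sum(q[i][k] * docs[i][w] for i in range(len(docs)))
            for w in vocab} for k in range(K)]
    phi = [{w: v / sum(p.values()) for w, v in p.items()} for p in phi]
    # E-step: posteriors over clusters given the current parameters.
    for i, d in enumerate(docs):
        logs = [math.log(pi[k]) + sum(n * math.log(phi[k][w]) for w, n in d.items())
                for k in range(K)]
        m = max(logs)
        probs = [math.exp(l - m) for l in logs]
        q[i] = [p / sum(probs) for p in probs]

for d, row in zip(docs, q):
    print(" ".join(sorted(d)), "->", max(range(K), key=lambda k: row[k]))
```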
- N-grams, smoothing, speech recognition (bigram sketch below).
- Reading: Language modeling
- Homework 3 due
- Optional advanced reading: An empirical study of smoothing techniques for language models, especially sections 2.7 and 3 on Kneser-Ney smoothing; A hierarchical Bayesian language model based on Pitman-Yor processes (requires some machine learning background)
- Slides
- Demo
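A minimal bigram language model with add-k smoothing, much simpler than the Kneser-Ney methods in the optional readings; the corpus below is invented.

```python
import math
from collections import Counter

# Toy corpus (invented); <s> and </s> mark sentence boundaries.
corpus = [["<s>", "i", "like", "nlp", "</s>"],
          ["<s>", "i", "like", "speech", "</s>"],
          ["<s>", "nlp", "is", "fun", "</s>"]]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    unigrams.update(sent)
    bigrams.update(zip(sent, sent[1:]))

vocab = set(unigrams)

def bigram_logprob(sent, k=0.1):
    """Add-k smoothed bigram log-probability:
    P(w | u) = (count(u, w) + k) / (count(u) + k * |V|)."""
    lp = 0.0
    for u, w in zip(sent, sent[1:]):
        lp += math.log((bigrams[(u, w)] + k) / (unigrams[u] + k * len(vocab)))
    return lp

print(bigram_logprob(["<s>", "i", "like", "nlp", "</s>"]))
print(bigram_logprob(["<s>", "nlp", "like", "i", "</s>"]))  # lower
```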
- Finite-state automata.
- Problem set 1b due on September 10 at 3pm
- Reading: Knight and May (sections 1-3)
- Supplemental reading: my notes, chapter 9-9.3; Jurafsky and Martin chapter 2.
- Transduction and composition, edit distance (dynamic programming sketch below).
- Homework 4 due
- Reading: Chapter 2 of Linguistic Fundamentals for NLP
- Reading: my notes, chapter 9.4 onward (not done yet)
- Optional reading: OpenFST slides.
- More formal additional reading: Weighted Finite-State Transducers in speech recognition
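To make the edit distance material concrete, the standard Levenshtein dynamic program with unit costs; the example strings are arbitrary.

```python
def edit_distance(src, tgt):
    """Levenshtein distance: minimum number of insertions, deletions,
    and substitutions turning src into tgt, by dynamic programming."""
    m, n = len(src), len(tgt)
    # d[i][j] = cost of transforming src[:i] into tgt[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i          # delete everything
    for j in range(n + 1):
        d[0][j] = j          # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if src[i - 1] == tgt[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[m][n]

print(edit_distance("transduce", "transpose"))  # 3
```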
- Part-of-speech tags, hidden Markov models.
- Problem set 2a out
- Reading: likely my notes
- Optional reading: Tagging problems and hidden Markov models
- Viterbi, the forward algorithm, and B-I-O encoding (decoding sketch below).
- Homework 5 due
- Reading: Conditional random fields
- Optional reading: CRF tutorial; Discriminative training of HMMs
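A sketch of Viterbi decoding for a two-state HMM tagger, working in log space; every parameter below is invented for illustration.

```python
import math

def viterbi(obs, states, logp_trans, logp_emit, logp_start):
    """Viterbi decoding: best[t][s] is the log-probability of the best
    tag sequence ending in state s after t+1 observations."""
    best = [{s: logp_start[s] + logp_emit[s].get(obs[0], -1e9) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] + logp_trans[r][s])
            best[t][s] = (best[t - 1][prev] + logp_trans[prev][s]
                          + logp_emit[s].get(obs[t], -1e9))
            back[t][s] = prev
    # Follow backpointers from the best final state.
    s = max(states, key=lambda r: best[-1][r])
    path = [s]
    for t in range(len(obs) - 1, 0, -1):
        s = back[t][s]
        path.append(s)
    return path[::-1]

# Toy two-tag HMM (parameters invented for illustration).
lg = math.log
states = ["N", "V"]
logp_start = {"N": lg(0.7), "V": lg(0.3)}
logp_trans = {"N": {"N": lg(0.4), "V": lg(0.6)},
              "V": {"N": lg(0.8), "V": lg(0.2)}}
logp_emit = {"N": {"fish": lg(0.6), "sleep": lg(0.4)},
             "V": {"fish": lg(0.3), "sleep": lg(0.7)}}
print(viterbi(["fish", "sleep"], states, logp_trans, logp_emit, logp_start))
```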
- Discriminative structure prediction, conditional random fields, and the forward-backward algorithm (marginals sketch below).
- Problem set 2a due
- Problem set 2b out
- Reading: Forward-backward
- Optional reading: Two decades of unsupervised POS tagging: how far have we come?
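A companion sketch of the forward-backward algorithm, computing per-position state marginals for the same toy HMM as the Viterbi sketch. It works in probability space for readability; a real implementation would rescale or use log space.

```python
def forward_backward(obs, states, p_trans, p_emit, p_start):
    """Compute P(state at t = s | obs) with forward-backward.
    alpha[t][s] = P(obs[:t+1], state_t = s);
    beta[t][s]  = P(obs[t+1:] | state_t = s)."""
    T = len(obs)
    alpha = [{s: p_start[s] * p_emit[s][obs[0]] for s in states}]
    for t in range(1, T):
        alpha.append({s: p_emit[s][obs[t]] * sum(alpha[t - 1][r] * p_trans[r][s]
                                                 for r in states) for s in states})
    beta = [dict() for _ in range(T)]
    beta[T - 1] = {s: 1.0 for s in states}
    for t in range(T - 2, -1, -1):
        beta[t] = {s: sum(p_trans[s][r] * p_emit[r][obs[t + 1]] * beta[t + 1][r]
                          for r in states) for s in states}
    Z = sum(alpha[T - 1][s] for s in states)  # P(obs)
    return [{s: alpha[t][s] * beta[t][s] / Z for s in states} for t in range(T)]

# Same toy HMM shape as the Viterbi sketch, in probability space.
states = ["N", "V"]
p_start = {"N": 0.7, "V": 0.3}
p_trans = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}
p_emit = {"N": {"fish": 0.6, "sleep": 0.4}, "V": {"fish": 0.3, "sleep": 0.7}}
for t, marg in enumerate(forward_backward(["fish", "sleep"], states,
                                          p_trans, p_emit, p_start)):
    print(t, marg)  # each row sums to 1
```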
- Context-free grammars; constituency; parsing (CKY sketch below).
- Homework 6 due
- Reading: Probabilistic context-free grammars and possibly my notes.
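A minimal CKY parser for a toy PCFG in Chomsky normal form; the grammar, probabilities, and sentence are invented.

```python
import math
from collections import defaultdict

# Toy PCFG in Chomsky normal form (invented for illustration).
binary = {("S", ("NP", "VP")): 1.0,
          ("VP", ("V", "NP")): 1.0}
lexical = {("NP", "fish"): 0.5, ("NP", "robots"): 0.5,
           ("V", "eat"): 1.0}

def cky(words):
    """CKY: chart[(i, j)][X] = log-probability of the best parse of
    words[i:j] rooted at nonterminal X."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):
        for (X, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1)][X] = math.log(p)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):       # split point
                for (X, (Y, Z)), p in binary.items():
                    if Y in chart[(i, k)] and Z in chart[(k, j)]:
                        score = math.log(p) + chart[(i, k)][Y] + chart[(k, j)][Z]
                        if score > chart[(i, j)].get(X, -math.inf):
                            chart[(i, j)][X] = score
    return chart[(0, n)].get("S")

print(cky("robots eat fish".split()))  # log(0.25)
```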
- Dependency parsing.
- Problem set 2b due
- Reading: my notes
- Optional reading: Eisner algorithm worksheet; Characterizing the errors of data-driven dependency parsing models; Short textbook on dependency parsing (the PDF should be free from a GT computer).
- Homework 7 due
- Lexicalized parsing.
- Minimal review notes from 2013
- Notes
- Problem set 3 out
- Reading: Lexicalized PCFGs
- Optional reading: Accurate unlexicalized parsing
- Homework 8 due
- Mostly CCG, but a little about LTAG and HPSG.
- Problem set 3 due
- Reading: likely my notes, unless I can find something good
- Optional reading: Intro to CCG; The inside-outside algorithm; Corpus-based induction of linguistic structure; Much more about CCG; LTAG; Probabilistic disambiguation models for wide-coverage HPSG
- Logical semantics.
- Homework 9 due
- Reading: Manning, Intro to Formal Computational Semantics
- Optional reading: Learning to map sentences to logical form
- Frame semantics, and semantic role labeling.
- Homework 10 due
- Reading: likely my notes.
- Optional reading: Automatic labeling of semantic roles; SRL via ILP.
- Optional video
- Vector semantics, latent semantic indexing, neural word embeddings (similarity sketch below).
- Problem set 4 out
- Reading: Vector-space models, sections 1, 2, 4-4.4, 6
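A small distributional-semantics sketch: word vectors built from windowed co-occurrence counts, reweighted with positive PMI, and compared by cosine similarity. The corpus and window size are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (invented); context = words within a +/-2 window.
corpus = ("dogs chase cats . cats chase mice . "
          "dogs eat food . cats eat food .").split()

counts = defaultdict(Counter)
for i, w in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if i != j:
            counts[w][corpus[j]] += 1

total = sum(sum(c.values()) for c in counts.values())
row = {w: sum(c.values()) for w, c in counts.items()}

def ppmi(w, c):
    """Positive PMI: max(0, log [P(w,c) / (P(w) P(c))])."""
    n_wc = counts[w][c]
    if n_wc == 0:
        return 0.0
    return max(0.0, math.log(n_wc * total / (row[w] * row[c])))

def cosine(u, v):
    dims = set(counts)
    uu = [ppmi(u, c) for c in dims]
    vv = [ppmi(v, c) for c in dims]
    dot = sum(a * b for a, b in zip(uu, vv))
    norm = math.sqrt(sum(a * a for a in uu)) * math.sqrt(sum(b * b for b in vv))
    return dot / norm if norm else 0.0

print(cosine("dogs", "cats"))  # relatively high: similar contexts
print(cosine("dogs", "food"))  # lower
```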
- Coreference resolution: knowing who's on first.
- Notes
- Slides
- Homework 11 due
- Reading: likely my notes
- Optional reading: Multi-pass sieve; Large-scale multi-document coreference
- Coherence; discourse connectives; rhetorical structure theory; speech acts.
- Homework 12 due
- Problem set 4 due
- Reading: likely my notes
- Optional: Discourse structure and language technology; Modeling local coherence; Sentence-level discourse parsing
- Learning from the wrong data
- Reading: likely my notes
- Independent project proposal due
- Optional reading: Jerry Zhu's survey; Jerry Zhu's book
- Machine translation and word alignment (IBM Model 1 sketch below).
- Homework 13 due
- Reading: IBM models 1 and 2
- Optional reading: Statistical machine translation
- Reading: Intro to Synchronous Grammars
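A compact EM sketch for IBM Model 1 translation probabilities, on an invented three-sentence parallel corpus (no NULL word, uniform alignment prior).

```python
from collections import defaultdict

# Toy parallel corpus (invented): (French, English) sentence pairs.
corpus = [("la maison", "the house"),
          ("la fleur", "the flower"),
          ("maison bleue", "blue house")]
corpus = [(f.split(), e.split()) for f, e in corpus]

f_vocab = {f for fs, _ in corpus for f in fs}
e_vocab = {e for _, es in corpus for e in es}

# Uniform initialization of the translation table t(f | e).
t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}

for _ in range(20):  # EM iterations
    count = defaultdict(float)  # expected count of (f, e) links
    total = defaultdict(float)  # expected count of e
    # E-step: fractional alignment counts under the current t(f | e).
    for fs, es in corpus:
        for f in fs:
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                count[(f, e)] += t[(f, e)] / norm
                total[e] += t[(f, e)] / norm
    # M-step: renormalize the expected counts into probabilities.
    for f, e in t:
        t[(f, e)] = count[(f, e)] / total[e]

for e in sorted(e_vocab):
    best = max(f_vocab, key=lambda f: t[(f, e)])
    print(e, "->", best, round(t[(best, e)], 2))
```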
- Learning to process many languages at once.
- Reading: Multi-source transfer of delexicalized dependency parsers
- Optional reading: Cross-lingual word clusters; Climbing the tower of Babel
- Initial result submissions due December 1 at 5pm.
- Compositional vector semantics.
- Homework 14 due
- Optional reading: Semantic compositionality through recursive matrix-vector spaces; Vector-based models of semantic composition
- December 5: Initial project report due at 5PM
- December 11: Final project report due at 5PM