| Date | Topic | Reading | Assignments |
| --- | --- | --- | --- |
| 15 Oct | (no class - introductory week) | | |
| 18 Oct | (no class - introductory week) | | |
| 22 Oct | Introduction | | |
| 25 Oct | Crash course in Python and NLTK | | A1 |
| 29 Oct | Generative statistical models; n-grams | Jurafsky & Martin, Chapters 4.1-4.2 ("N-grams") | |
| | | * Kevin Murphy, Binomial and multinomial distributions | |
| | | * Jurafsky & Martin (Smoothing), Chapters 4.5-4.7 | |
| | | * Manning & Schütze (Smoothing), Chapters 6.2-6.3 | |
| 01 Nov | (holiday - All Saints) | | |
| 05 Nov | Hidden Markov Models | Jurafsky & Martin, Chapters 5.1-5.5 (POS tagging), 6.1-6.4 (HMMs) | A1 due; A2 |
| 08 Nov | Training HMMs | Jurafsky & Martin, Chapters 5.7-5.8 (POS tagging evaluation), 6.5 (Forward-Backward) | |
| | | * Jason Eisner, HMM-in-a-spreadsheet | |
| 12 Nov | Discussion of Assignment 1 | | |
| 15 Nov | Context-free grammars | | |
| 19 Nov | Complexity of algorithms and the CKY parser | | A2 due; A3 |
| 22 Nov | Probabilistic CFGs | Jurafsky & Martin, Chapter 14.1-14.5, 14.7 | |
| 26 Nov | Discussion of Assignment 2 | | |
| 29 Nov | Training PCFGs | | |
| 03 Dec | (Kathrin Passig talk) | | |
| 06 Dec | More accurate PCFG parsing | Mark Johnson (1998), PCFG Models of Linguistic Tree Representations (esp. on parent annotations) | A3 due; A4 |
| | | Michael Collins, Lexicalized PCFGs | |
| | | Dan Klein & Chris Manning (2003), Accurate unlexicalized parsing | |
| | | * Michael Collins (1997), Three Generative, Lexicalized Models for Statistical Parsing | |
| | | * David Magerman (1995), Description of head percolation tables | |
| | | * Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein (2006), Learning accurate, compact, and interpretable tree annotation | |
| | | William Morgan, Statistical Hypothesis Tests for NLP (explains approximate randomization) | |
| | | Berg-Kirkpatrick et al. (2012), An empirical investigation of statistical significance for NLP (explains bootstrap testing) | |
| 10 Dec | Advanced PCFG parsing algorithms | Joshua Goodman (1999), Semiring parsing (not Section 4, 6; ignore Earley parser if you like) | |
| | | Charniak et al. (2006), Multilevel coarse-to-fine PCFG parsing | |
| | | Dan Klein & Chris Manning (2003), A* Parsing - Fast exact Viterbi parse selection | |
| | | * Stuart Shieber, Yves Schabes, and Fernando Pereira (1993), Principles and implementation of deductive parsing | |
| | | * Eugene Charniak, Sharon Goldwater, and Mark Johnson (1998), Edge-based best-first chart parsing | |
| | | * Liang Huang and David Chiang (2005), Better k-best parsing | |
| | | * Michael Collins and Terry Koo (2005), Discriminative reranking for natural language parsing | |
| | | * Eugene Charniak and Mark Johnson (2005), Coarse-to-fine n-best parsing and MaxEnt discriminative reranking | |
| 13 Dec | Discussion of Assignment 3 | | |
| 17 Dec | Dependency parsing | Ryan McDonald, Fernando Pereira, Kiril Ribarov, Jan Hajic (2005), Non-projective Dependency Parsing using Spanning Tree Algorithms | |
| | | Joakim Nivre (2008), Algorithms for Deterministic Incremental Dependency Parsing | |
| | | * Joakim Nivre, Ryan McDonald (2007), Characterizing the Errors of Data-Driven Dependency Parsing Models | |
| | | * Jason Eisner (1996), Three New Probabilistic Models for Dependency Parsing -- An Exploration | |
| | | * Eliyahu Kiperwasser and Yoav Goldberg (2016), Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations | |
| | | * Google, SyntaxNet | |
| 20 Dec | Statistical machine translation - Alignments | | A4 due; A5 |
| 24 Dec | (Christmas break) | | |
| 27 Dec | (Christmas break) | | |
| 31 Dec | (Christmas break) | | |
| 03 Jan | (Christmas break) | | |
| 07 Jan | Statistical machine translation - Translation | | |
| 10 Jan | Discussion of Assignment 4 | | |
| 14 Jan | More expressive grammar formalisms | | |
| 17 Jan | Bayesian methods - LDA | | A5 due; A6 |
| 21 Jan | Bayesian methods - Grammar induction | | |
| 24 Jan | Discussion of Assignment 5 | | |
| 28 Jan | Semantic parsing | | |
| 31 Jan | Lexical semantics | Jurafsky & Martin, Chapters 19.1-19.3, 20.6-20.8 | A6 due |
| | | * Mikolov et al. (2013), Efficient estimation of word representations in vector space | |
| | | * Jeff Mitchell & Mirella Lapata (2008), Vector-based models of semantic composition | |
| | | * Marco Baroni & Roberto Zamparelli (2010), Nouns are vectors, adjectives are matrices | |
| | | * Yehoshua Bar-Hillel (1960), A demonstration of the nonfeasibility of fully automatic high quality translation | |
| | | * Fellbaum et al., Wordnet website | |
| | | * Peters et al. (2018), Deep contextualized word representations | |
| | | * Devlin et al. (2019), BERT: Pre-training of deep bidirectional transformers for language understanding | |
| 04 Feb | Discussion of Assignment 6 | | |
| 07 Feb | Discussion of final projects | | |