Structured neural models


Summer semester 2019
Prof. Dr. Alexander Koller
Fri 12-14; C72 Meeting Room 2.11

Start: Friday, April 12


Important notice: We currently have almost as many pre-registrations for the course as we have slots for student presentations. If you would like to participate in the class, please email me first to check whether there is still room.

Over the past few years, neural models have set new states of the art across all areas of computational linguistics. Many of these neural models are end-to-end, in that they learn to map directly from an input (e.g. a sentence) to some output representation (POS tags, syntax trees, semantic representations, etc.), without computing an internal representation beyond the activations of the neurons in the hidden layers. Such models, while very effective for many tasks, have certain disadvantages: they are very data-hungry; they do not always generalize well to unseen data or related tasks; and they make it hard to inject linguistic knowledge when it is available.

In this course, we will look at neural models which can capture structured information, e.g. syntax trees, semantic representations, or dialogue states. Because such information is usually not annotated at large scale, we will focus in particular on neural models which can deal with latent variables, i.e. random variables which are part of the generative process of the neural model, but whose values cannot be observed at training time. We will cover the main approaches that have been developed for this scenario in the past 2-3 years, and will then discuss applications of deep latent-variable models to computational linguistics.
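To give a flavour of what "latent variable" means here, the sketch below (my own illustration, not taken from any of the course papers) shows a minimal deep latent-variable model in the style of a variational autoencoder, assuming PyTorch: an encoder predicts a Gaussian over an unobserved code z, the reparameterization trick keeps sampling differentiable, and a decoder reconstructs the input from z. All names (LatentVariableModel, latent_dim, etc.) are illustrative.

    # Minimal sketch of a deep latent-variable model (VAE-style), assuming PyTorch.
    import torch
    import torch.nn as nn

    class LatentVariableModel(nn.Module):
        def __init__(self, input_dim=100, hidden_dim=64, latent_dim=16):
            super().__init__()
            self.encoder = nn.Linear(input_dim, hidden_dim)
            self.to_mu = nn.Linear(hidden_dim, latent_dim)
            self.to_logvar = nn.Linear(hidden_dim, latent_dim)
            self.decoder = nn.Linear(latent_dim, input_dim)

        def forward(self, x):
            h = torch.relu(self.encoder(x))
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            # Reparameterization trick: z = mu + sigma * eps, so gradients
            # flow through mu and logvar even though z is sampled.
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.decoder(z), mu, logvar

At training time only the input x is observed; z is the latent variable, and learning typically maximizes a variational lower bound on the likelihood of x.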

Please consult the schedule for a detailed list of papers. Some of the papers will be presented as talks; others will be discussed in reading-group style, where we all read the paper and then discuss it together in class.

We will use Piazza to communicate between classes.

Grading. You will give a talk about a paper of your choice (60 minutes). You may also write a seminar paper during the term break.

If you choose to write a seminar paper, your overall grade for the course will consist of 40% talk grade + 40% grade for the seminar paper + 20% grade for active participation in the in-class discussions and reading-group sessions. If you do not write a seminar paper, your grade will be 66% talk grade + 34% participation grade.

This seminar is suitable for advanced BSc students and for MSc students. It will be taught in English.

Prerequisites. This is a course on a very technical subject. It requires familiarity with neural networks (feed-forward networks and LSTMs) and the theory behind them, e.g. at the level of Klakow’s Neural Networks lecture. Furthermore, the papers we will read are very recent (from the past 2-3 years), and you may have to follow the citations to understand some of their technical background. If this appeals to you, I look forward to working with you!