Technische Universität München Robotics and Embedded Systems

Sequence Learning (SS2009)

Organizer: Alex Graves, Ph.D.
Module: IN8901
Type: Advanced seminar (Hauptseminar)
Semester: Summer semester (SS) 2009
ECTS: 4.0
Preliminary meeting: Tuesday, 27.01.2009, 16:45, MI 03.07.023
Time & place: To be announced. Probably a block seminar after the end of the semester.
Certificate (Schein): Successful participation in the seminar.


Students should choose one paper from one of the topics below. Students are advised to make use of the supplementary reading, as well as the references in the papers. Papers marked with a * are harder / more mathematical. Note, however, that the difficulty will be taken into account when awarding grades. Documents marked 'offline' are not available online, but can be borrowed from Alex Graves. Note that you will need to be inside the TU network (where the university subscriptions operate) to download some of the papers.

Topic | Papers | Background Material
Speech Recognition | Paper 1 \ Paper 2 |
Handwriting Recognition | Paper 1 \ Paper 2 \ Paper 3 | Review Paper
fMRI/EEG brain scan processing | Paper 1 | Paper
Financial Forecasting | Florian Trifterer's MSc thesis (offline) |
Natural Language Processing | Paper 1 * \ Paper 2 * \ Paper 3 | Textbook (offline)
Weather Prediction | Paper 1 * | Textbook (offline)
Algorithms | Paper 1 * \ Paper 2 * \ Paper 3 * |


Sequences of strongly correlated data occur everywhere in nature, from human language to evolving weather patterns. However, traditional machine learning techniques are based on the assumption that data points are independent. Sequence learning aims to overcome this limitation by designing architectures and algorithms that reflect the strong dependencies between successive data points.

A good way to start is to examine the generic properties of sequential data, and the problems these pose for conventional machine learning algorithms. Students should then move on to early approaches to sequence learning, such as recurrent neural networks and hidden Markov models, and finally progress to more recent methods, such as sequential graphical models, sequential Gaussian processes and advanced recurrent networks. A list of references will be provided to those interested.


Each student presents one topic, usually based on a single paper or book chapter (although other papers will usually be provided for background reading). The presentation should last about 30 minutes, followed by 5 to 10 minutes for questions and discussion. Besides the official presentation, we will also talk about presentation style and slide design. Talk to your advisor at least 2 weeks before your scheduled talk and show them your presentation.


Presenters give their talks to the whole group, not just the instructors. Guests who are not enrolled in the seminar but are interested in the topics are also welcome, so bring your friends and fans as well. This is a good opportunity to practice your presentation skills in front of a larger audience.


You must also write a summary of your talk. It should be about 10-15 pages. Hand it in by the end of the semester. It is better, however, to finish your summary before you give your talk, because writing things down in your own words will help you realize which parts of the paper(s) are important. We will make the written summaries available to all participants.


In order to get the credits (ECTS/Schein), you must give a presentation, write a summary and attend the seminar meetings (occasional exceptions to the last requirement can be made on an individual basis). The seminar is worth 4 credits (ECTS) or 2 SWS.

Background Material

"Pattern Recognition and Machine Learning" by Christopher Bishop gives an excellent overview of machine learning, and should be read by everyone considering serious research in the field. Chapter 13 gives a brief overview of sequence learning, with particular emphasis on hidden Markov models and linear dynamical systems. Graphical models (an increasingly important tool in sequence learning) are reviewed in Chapter 8, and feedforward neural networks (which form the basis for recurrent neural networks) are covered in Chapter 5. The book is best read cover to cover, since this gives an idea of how the subfields of machine learning are connected. In particular, the first four chapters provide essential background on probability theory, information theory, classification and regression.

Alex Graves' Ph.D. thesis gives an introduction to recurrent neural networks, shows how they can be applied to sequence labelling problems, and covers advanced topics such as long short-term memory and multidimensional recurrent networks.