Speech Recognition: Lecture [H02A6a] (2025-2026 Second semester)

LECTURES

12 lectures of 2 hours each
These are classical lectures in which we strive for a high degree of interaction
CONTENT:
PART I: Understanding speech in time-frequency
PART II: Essential methods used in speech recognition (DTW, HMM)
PART III: Deep Neural Networks based Speech Recognition

Dirk Van Compernolle and Vipul Arora

6 hands-on sessions of 2.5 hours each, held in a PC lab at ESAT roughly one week after the corresponding lecture.
The exercises include listening tests, computational exercises, design tests, etc.
Some questions need to be solved by hand, but we mostly work in Jupyter notebooks on Colab.
Partial solutions are provided 1-2 weeks after the exercises.

All exercise sessions are organised in an ESAT PC lab in groups of ~30 students. There are two sessions per week; each student attends only one.
The ECTS time schedule tends to assign students to group A or B, so you see only one of the slots in your schedule.
For us, it is NOT important when you come, as long as there is no overflow problem in the rooms.
While you can do the exercises at home on your own, it is recommended to attend the exercise sessions for discussions and feedback.
TAs are available to answer your questions DURING the exercise sessions.

Students MUST have a basic knowledge of Machine Learning principles, i.e. a solid linear algebra background and a working knowledge of basic (Bayesian) statistics and information theory.
Practically speaking, students should have followed (and passed) the course on Machine Learning in the first semester or should have acquired equivalent basic knowledge elsewhere.

Date	Topics	Video links
03/03/2026	Sequence Recognition, Framewise ASR, DTW, FST, probability theory, graphical models	[video]
10/03/2026	HMMs for Speech: Viterbi Alignment and Recognition	[video]
17/03/2026	Context Dependent Models, strengths and weaknesses of HMMs	[video]
24/03/2026	HMM/DNN, loss functions, optimization	[video]
31/03/2026	Modeling sequential data with ANNs	[video-1], [video-2], [video-3]
01/04/2026	Language models for ASR	[video]
21/04/2026	End-to-end 1: CTC	[video]
28/04/2026	End-to-end 2: RNN-T	[video]
12/05/2026	Self-supervised learning and adaptation	[video]