EE627A: Speech Signal Processing (Spring 2021)
Vipul Arora
Department of Electrical Engineering, IIT Kanpur
Course Videos: YouTube
Course Objectives:
This course will be taught jointly with Prof. Rajesh Hegde. I will be teaching the latter half, focusing on ASR.
This part of the course aims to introduce students to topics in automatic speech recognition (ASR). The course will deal with the concepts involved in building an ASR system. Starting with the conventional methods, it will touch upon the latest deep-learning-based methods. The Kaldi and OpenFst toolkits will be introduced. The lectures will focus on mathematical principles, and there will be coding-based assignments for implementation.
Topics:
- Conventional ASR systems
- Gaussian Mixture Models
- Hidden Markov Models
- Finite State Transducers
- Decision Trees
- Kaldi toolkit
- Hybrid HMM-DNN ASR systems
- Deep Neural Networks
- End-to-end ASR systems
- Connectionist Temporal Classification
- Other topics of interest
Lecture Plan:
Wk of 2021 | Week of Sem | Topics |
---|---|---|
11 | Week-9 | Hidden Markov Models |
12 | Week-10 | Finite State Transducers (OpenFst) and Language Models |
13 | Week-11 | GMM-HMM based ASR (HTK book, Kaldi) |
14 | Week-12 | Decision Trees (HTK book, Kaldi) |
15 | Week-13 | Kaldi toolkit |
16 | Week-14 | Neural Networks |
17 | Week-15 | DNN-HMM ASR |
18 | Week-16 | End-to-end ASR |
Grading Scheme:
- Project – 20%
Digit recognition using Kaldi ASR toolkit.
- Follow https://kaldi-asr.org/doc/kaldi_for_dummies.html
- Prepare your own dataset. Many of you can collaborate to build the dataset.
- The test set will be provided by the instructor.
- Submission:
- 10% for project report (up to 3 pages + 1 extra for references) and presentation (5-8 min; present from the report only, no slides needed). Use the ICASSP template.
- 10% for test set evaluation.
- Bonus (up to 5%) for a real-time ASR demo.
- End-semester Exam – 30%
Written exam on the CodeTantra platform.
Plagiarism Penalty:
As heavy as possible. Zero-tolerance policy.
References:
- “Automatic Speech Recognition: A Deep Learning Approach”, D. Yu and L. Deng, Springer, 2016
- The HTK book
- https://kaldi-asr.org/doc/
- “Pattern Recognition and Machine Learning”, C.M. Bishop, 2nd Edition, Springer, 2011. https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
- “Deep Learning”, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016. https://www.deeplearningbook.org/