EE627A: Speech Signal Processing (Spring 2021)

Vipul Arora
Department of Electrical Engineering, IIT Kanpur

Course Videos: YouTube

Course Objectives:

This course will be taught jointly with Prof. Rajesh Hegde. I will teach the latter half, focusing on ASR.

This part of the course introduces students to topics in automatic speech recognition (ASR). It covers the concepts involved in building an ASR system, starting with conventional methods and then touching upon the latest deep-learning-based methods. The Kaldi and OpenFST toolkits will be introduced. The lectures will focus on mathematical principles, and coding-based assignments will cover implementation.

Topics:

Lecture Plan:

Week of 2021   Week of Semester   Topics
11             Week-9             Hidden Markov Models
12             Week-10            Finite State Transducers (OpenFST) and Language Models
13             Week-11            GMM-HMM based ASR (HTK book, Kaldi)
14             Week-12            Decision Trees (HTK book, Kaldi)
15             Week-13            Kaldi toolkit
16             Week-14            Neural Networks
17             Week-15            DNN-HMM ASR
18             Week-16            End-to-end ASR

Grading Scheme

  1. Project – 20%
    Digit recognition using the Kaldi ASR toolkit.
    • Follow https://kaldi-asr.org/doc/kaldi_for_dummies.html
    • Prepare your own dataset. You may collaborate with other students to build it.
    • The test set will be provided by the instructor.
    • Submission:
      • 10% for the project report (up to 3 pages, plus 1 extra page for references) and presentation (5-8 min; present directly from the report, no slides needed). Use the ICASSP template.
      • 10% for test set evaluation.
      • Bonus (up to 5%) for a real-time ASR demo.
  2. End-semester Exam – 30%
    Written exam on the CodeTantra platform.
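
For the dataset preparation step of the project, the following is a minimal sketch of how the per-utterance Kaldi data files (`wav.scp`, `text`, `utt2spk`) might be generated. It assumes recordings are stored as `<speaker>/<d>_<d>_<d>.wav`, with the spoken digits encoded in the file name; that layout, the `DIGITS` word list, and the function name `build_kaldi_files` are illustrative assumptions, so adapt them to however your group organizes its recordings.

```python
from pathlib import Path

# Assumed word list: index i is the spoken form of digit i.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def build_kaldi_files(wav_paths):
    """Build the lines of wav.scp, text and utt2spk.

    Hypothetical helper: expects paths shaped like
    <anything>/<speaker>/<d>_<d>_<d>.wav, e.g. audio/spk1/1_5_6.wav.
    """
    rows = []
    for p in map(Path, wav_paths):
        speaker = p.parent.name
        # Prefix the utterance ID with the speaker ID so that sorting
        # by utterance ID also groups utterances by speaker.
        utt_id = f"{speaker}_{p.stem}"
        words = " ".join(DIGITS[int(d)] for d in p.stem.split("_"))
        rows.append((utt_id, speaker, p.as_posix(), words))

    # Kaldi expects these files to be sorted by utterance ID.
    rows.sort()

    wav_scp = [f"{utt} {path}" for utt, _, path, _ in rows]
    text = [f"{utt} {words}" for utt, _, _, words in rows]
    utt2spk = [f"{utt} {spk}" for utt, spk, _, _ in rows]
    return wav_scp, text, utt2spk


if __name__ == "__main__":
    scp, txt, u2s = build_kaldi_files(
        ["audio/spk2/0_9_3.wav", "audio/spk1/1_5_6.wav"]
    )
    for name, lines in [("wav.scp", scp), ("text", txt), ("utt2spk", u2s)]:
        print(f"--- {name} ---")
        print("\n".join(lines))
```

Each returned list can be written, one entry per line, to the corresponding file in the train and test data directories described in the tutorial linked above.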

Plagiarism Penalty:
As heavy as possible. Zero-tolerance policy.

References: