Machine Learning for Signal Processing: EE603A (Fall 2021)
Vipul Arora
Department of Electrical Engineering, IIT Kanpur
⚠ The focus will be on AUDIO signals
TAs:
Vikas - kvikas@
Sumit - krsumit@
Adhiraj - adhiraj@
Rahul - rkodag@
Ali - alifaraz@
Course Objectives:
This course introduces students to machine learning (ML) techniques used in signal processing applications, including spectral processing techniques for the analysis and transformation of audio signals. Lectures will focus on mathematical principles, with coding-based assignments for implementation. Prior exposure to ML is not required. The course focuses on applications in audio signal processing, and the theory will be tailored to that end.
Pre-requisites:
- Digital signal processing (EE301A or equivalent). If you have not taken it, you will have to learn it; EE200 may also help.
- Basics of Programming (ESc101 or equivalent)
The course requires a strong background in linear algebra and probability theory.
Grading:
- Quizzes: 2 × 10% = 20% (27 August and 29 October, both Fridays)
- Project: 15%
- Assignments: 5%
- Lecture Notes: 5% (LaTeX document on overleaf.com, figures drawn in draw.io, submission via GitHub)
  - Lecture notes for the topic
  - Jupyter notebook (.ipynb) explaining the topic with a simple example
  - Mind map of the entire course (bonus 2%)
- Mid-sem: 25%
- End-sem: 30%
Teams of 2 (project + lecture notes)
Topics:
- Linear Algebra Refresher
- Programming Basics: Python and bash scripting
- Digital Signal Processing for audio
- Probability Theory Refresher
- Machine Learning basics
- Neural Networks
- Music Information Retrieval
- Speech Recognition
- Other applications in audio processing, e.g., acoustic event detection, speaker diarization, music genre classification, auto-tagging, query by humming, and melody estimation
Plagiarism Penalty:
As heavy as possible. Zero-tolerance policy.
References:
This course will draw on excerpts from standard books on machine learning and signal processing, but it will largely be based on articles and research papers from ML and SP conferences (e.g., ICASSP, NeurIPS, ICML, Interspeech, ISMIR) and journals (e.g., IEEE TASLP, JMLR, IEEE PAMI).
Books:
- “Pattern Recognition and Machine Learning”, C.M. Bishop, 2nd Edition, Springer, 2011.
- “Deep Learning”, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016.
- “An Introduction to Audio Content Analysis”, A. Lerch, Wiley-IEEE Press, 2012.
- “Speech and Audio Signal Processing: Processing and Perception of Speech and Music”, B. Gold, N. Morgan, D. Ellis, Wiley, 2011.
- “Automatic Speech Recognition: A Deep Learning Approach”, D. Yu and L. Deng, Springer, 2016.
- “Signal Processing Methods for Music Transcription”, A. Klapuri and M. Davy, Springer, 2007.
Articles:
- Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara Sainath. “Deep Learning for Audio Signal Processing”, in IEEE Journal of Selected Topics in Signal Processing, Vol. 13, No. 2, May 2019, pages 206–219.
- Preeti Rao. “Audio Signal Processing”, in Speech, Audio, Image and Biomedical Signal Processing using Neural Networks, Springer, Berlin, Heidelberg, 2008, pages 169–189.