KAIST GSAI Spring 2022

AI599: Special Topics in Machine Learning: Deep Learning and Real-world Applications

Deep learning is now an integral part of the systems and tools people use every day, and is therefore no longer a concern of academic research alone. In this course, you will get a front-row view of practical issues in the research and development of deep learning systems from leading experts and researchers. Major course activities include:

  • Reading Response: You'll read and discuss important papers and articles in the field. Each week, there will be 1-2 reading assignments, for which you'll write a short response.
  • Topic Presentation: Once during the semester, you'll lead the class by summarizing the readings and spurring the in-class discussion.
  • In-class Activities: Each class will feature activities that will help you understand core concepts introduced in the course.

Course Staff

Instructors:
    Prof. Minsuk Chang
    Prof. Dongyoon Han
    Prof. Sangwoo Lee

TAs:
     Sunghyun Park
     Dongmin Choi

Staff Mailing List:
     dl_ai599@navercorp.com
     note: this is a group email address that includes the instructors and the TAs.

Time & Location

When: 10:30am-1:15pm, Fridays
Where:

Links

Course Website: https://ai599.github.io/spring-2022/
Submission & Grading: KLMS
Discussion Forum: TBD

Updates

  • 3/18: Lecture slides will be posted on KLMS before each class (unless they contain restricted content). Recordings of the classes are currently unavailable.
  • 3/4: First day of class!
  • 3/3: Extra enrollment is closed, but spaces may open up if others unregister. If you want to be waitlisted, please fill in this survey. We will enroll waitlisted students on a first-come, first-served basis as spaces open up.
  • 3/2: You may "audit" or "sit in" on this class, but you still have to submit reading responses and actively participate in class activities. If you're interested, please send an email to dl_ai599@navercorp.com.
  • 3/2: We are accepting extra enrollments, but spaces are limited to a total of 46 students. If you're interested in taking this class, please send an email to dl_ai599@navercorp.com and fill in this survey. Current headcount: 46/46
  • 2/28: Welcome to the deep learning and real-world applications class! We're still finalizing the schedule and the reading list. Stay tuned!

Schedule

Note: "response 1" and "response 2" mark the readings for Session 1 and Session 2 of each week; a reading response is required for one of the two articles in each session.

Week 1 (3/4): Introduction & Course Overview + AI research in industry (speaker: 하정우)
    Reading: please read the updated course syllabus, and ask any questions you might have.

Week 2 (3/11): Representation learning in computer vision
  Session 1: Learning representations with evolved model architectures (speaker: 한동윤)
    (1) [response 1] Tan, Mingxing, and Quoc Le. "EfficientNetV2: Smaller Models and Faster Training." ICML 2021.
    (2) [response 1] Liu, Ze, et al. "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows." ICCV 2021.
  Session 2: Practical scenarios and applications in computer vision (speaker: 유영준)
    (1) [response 2] An, Xiang, et al. "Partial FC: Training 10 Million Identities on a Single Machine." ICCV 2021.
    (2) [response 2] Sculley, David, et al. "Hidden technical debt in machine learning systems." NeurIPS 2015.

Week 3 (3/18): Towards reliable machine learning
  Session 1: Definition and real examples of shortcut learning (speaker: 전상혁)
    (1) [response 1] Brendel, et al. "Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet." ICLR 2019.
    (2) [response 1] Geirhos, et al. "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness." ICLR 2019.
  Session 2: Attempts to mitigate shortcut learning (speaker: 전상혁)
    (1) [response 2] Madry, et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
    (2) [response 2] Ganin, et al. "Domain-Adversarial Training of Neural Networks." JMLR 2016.

Week 4 (3/25): Multimodal representation learning
  Session 1: Multimodal deep learning (speaker: 김진화)
    (1) [response 1] Kim, Jin-Hwa, et al. "Bilinear Attention Networks." NeurIPS 2018.
    (2) [response 1] Anderson, Peter, et al. "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering." CVPR 2018.
  Session 2: Vision-and-Language Pre-training (speaker: 김원재)
    (1) [response 2] Lu, Jiasen, et al. "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks." NeurIPS 2019.
    (2) [response 2] Kim, Wonjae, et al. "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision." ICML 2021.

Week 5 (4/1): Noisy Labeling + Practical scenarios and applications in computer vision
  Session 1 (speaker: 송환준)
    (1) [response 1] Han, Bo, et al. "Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels." NeurIPS 2018.
    (2) [response 1] Li, Junnan, et al. "DivideMix: Learning with Noisy Labels as Semi-supervised Learning." ICLR 2020.
  Session 2 (speaker: 위동윤)
    (1) [response 2] Feichtenhofer, Christoph, et al. "SlowFast Networks for Video Recognition." ICCV 2019.
    (2) [response 2] Wang, Xiaolong, et al. "Non-local Neural Networks." CVPR 2018.

Week 6 (4/8): Practical scenarios and applications in computer vision
  Session 1 (speaker: 백영민)
    (1) [response 1] Kittenplon, Yair, et al. "Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer." arXiv 2022.
    (2) [response 1] Baek, Youngmin, et al. "Character Region Awareness for Text Detection." CVPR 2019.
  Session 2 (speaker: 이바도)
    (1) [response 2] Cha, Junbum, et al. "Few-shot Compositional Font Generation with Dual Memory." ECCV 2020.
    (2) [response 2] Park, Song, et al. "Few-shot Font Generation with Localized Style Representations and Factorization." AAAI 2021.

Week 7 (4/15): Generative models
  Session 1 (speaker: 김윤지)
    (1) [response 1] Ji, Xu, et al. "Invariant Information Clustering for Unsupervised Image Classification and Segmentation." ICCV 2019.
    (2) [response 1] Van Gansbeke, Wouter, et al. "SCAN: Learning to Classify Images without Labels." ECCV 2020.
  Session 2 (speaker: 김준호)
    (1) [response 2] Kang, Minguk, and Jaesik Park. "ContraGAN: Contrastive Learning for Conditional Image Generation." NeurIPS 2020.
    (2) [response 2] Liu, Bingchen, et al. "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis." ICLR 2021.

Week 8 (4/22): No class (midterm exams)

Week 9 (4/29): Voice synthesis and applications
  Session 1 (speaker: 송은우)
    (1) [response 1] Shen, Jonathan, et al. "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions." ICASSP 2018.
    (2) [response 1] Ren, Yi, et al. "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech." ICLR 2021.
  Session 2 (speaker: 황민제)
    (1) [response 2] Kumar, Kundan, et al. "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis." NeurIPS 2019.
    (2) [response 2] Yamamoto, Ryuichi, et al. "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram." ICASSP 2020.

Week 10 (5/6): Speech recognition and applications
  Session 1 (speaker: 김한규)
    (1) [response 1] Hsu, Wei-Ning, et al. "HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units." IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021.
    (2) [response 1] Chung, Yu-An, et al. "W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training." arXiv 2021.
  Session 2 (speaker: 정남규)
    (1) [response 2] Gulati, Anmol, et al. "Conformer: Convolution-augmented Transformer for Speech Recognition." Interspeech 2020.
    (2) [response 2] Han, Wei, et al. "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context." Interspeech 2020.

Week 11 (5/13): AutoML and Practical MLOps
  Session 1 (speaker: 김지훈)
    (1) [response 1] Real, Esteban, et al. "AutoML-Zero: Evolving Machine Learning Algorithms From Scratch." ICML 2020.
    (2) [response 1] Falkner, Stefan, et al. "BOHB: Robust and Efficient Hyperparameter Optimization at Scale." ICML 2018.
  Session 2 (speaker: 서동필): no reading this week

Week 12 (5/20): NLP, Dialogues, and QA
  Session 1 (speaker: 이상우)
    (1) [response 1] Devlin, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." NAACL 2019.
    (2) [response 1] Raffel, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." JMLR 2020.
  Session 2 (speaker: 김성동)
    (1) [response 2] Roller, Stephen, et al. "Recipes for building an open-domain chatbot." EACL 2021.
    (2) [response 2] Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." NeurIPS 2020.

Week 13 (5/27): Hyperscale LM & NLP applications
  Session 1 (speaker: 이기창)
    (1) [response 1] Brown, et al. "Language Models are Few-Shot Learners." NeurIPS 2020.
    (2) [response 1] Rae, et al. "Scaling Language Models: Methods, Analysis & Insights from Training Gopher." arXiv 2021.
  Session 2 (speaker: 유강민)
    (1) [response 2] Lester, Brian, et al. "The Power of Scale for Parameter-Efficient Prompt Tuning." EMNLP 2021.
    (2) [response 2] Li, Xiang Lisa, and Percy Liang. "Prefix-Tuning: Optimizing Continuous Prompts for Generation." arXiv 2021.

Week 14 (6/3): Human-centric NLP
  Session 1 (speaker: 이화란)
    (1) [response 1] Dinan, Emily, et al. "Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation." EMNLP 2020.
    (2) [response 1] Perez, Ethan, et al. "Red Teaming Language Models with Language Models." arXiv 2022.
  Session 2 (speakers: 정준영, 이민아)
    (1) [response 2] Chung, John Joon Young, et al. "TaleBrush: Sketching Stories with Generative Pretrained Language Models." CHI 2022.
    (2) [response 2] Lee, Mina, Percy Liang, and Qian Yang. "CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities." CHI 2022.

Week 15 (6/10): Large-scale user modeling and its applications
  Session 1 (speaker: 곽하녹)
    (1) [response 1] Shin, et al. "Scaling Law for Recommendation Models: Towards General-purpose User Representations." arXiv 2021.
    (2) [response 1] Shin, et al. "One4all user representation for recommender systems in e-commerce." arXiv 2021.
  Session 2 (speaker: 정지수)
    (1) [response 2] Hsieh, et al. "Collaborative Metric Learning." WWW 2017.
    (2) [response 2] Kim, Boseop, et al. "What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers." EMNLP 2021.

Week 16 (6/17): No class (final exams)

Topics (tentative)

Major topics include:
  • Representation Learning
  • Reliable ML
  • Voice and Speech
  • NLP
  • MLOps
  • Recommendation systems

Grading

  • Attendance: 20%
  • Reading responses: 40%
  • Topic presentation: 20%
  • Class participation: 10%
  • Quizzes: 10%
Late policy: Your three lowest reading response grades will be dropped. No late submissions are accepted for reading responses. Each quiz score will be normalized identically, based on the 5 questions asked during the lessons.

Prerequisites

There are no official course prerequisites, but the assignments involve a lot of reading. Research experience in machine learning is useful, but not required.