What can we learn without labels? Unsupervised Learning is a subfield of Machine Learning that studies how to mechanize the process of learning without feedback or labels. This is commonly understood as "learning structure". In this course we'll survey, compare, and contrast various approaches to unsupervised learning that arose from different disciplines, including:
1. Generative and Bayesian models
2. Deep unsupervised models
3. Frequency estimation
5. Information-theoretic approaches, including compression and rate-distortion theory
5. Recent incorporation of PAC learning into compression theory
Professor: Elad Hazan, CS building 407, Reception hours: Wed 10-11, or by appointment.
Requirements: This is a graduate-level course that requires significant mathematical background.
Required background: probability, discrete math, calculus, analysis, linear algebra, algorithms and data structures, theory of computation / complexity theory, undergraduate learning theory (402/424)
Recommended: linear programming, mathematical optimization, game theory, theoretical machine learning (511)
Attendance and the use of electronic devices: Attendance is expected at all lectures. The use of laptops and similar devices for note-taking is permitted.
Grading: This is a seminar course. Students will be assigned papers to study, and present in class. The grade will be based on the presentation of the studied material, as well as comprehension and participation in class.
1. Online EM algorithm for unsupervised learning, by Liang and Klein
2. A unifying view of linear Gaussian models, Roweis and Ghahramani
3. Building high level features using unsupervised learning, Le et al.
4. Unsupervised Learning of Video Representations using LSTMs, Srivastava et al.
5. Teaching and compressing for low VC dimension, Shay Moran, Amir Shpilka, Avi Wigderson, and Amir Yehudayoff
6. A spectral algorithm for learning HMMs, Hsu, Kakade and Zhang
7. Hristo S Paskov, Robert West, John C Mitchell, and Trevor Hastie. Compressive feature learning (NIPS 2013)
8. Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Ananda Theertha Suresh. Tight bounds for universal compression of large alphabets
9. Learning Deep Generative Models, Salakhutdinov
10. S. Hochreiter and K. Obermayer. Optimal Kernels for Unsupervised Learning, 2005
11. Rate-distortion theory and the Blahut–Arimoto (BA) algorithm
12. Generative adversarial networks
1. 8 Feb, Lecture 1: intro (notes by Eric Naslund)
2. 15 Feb, Lecture 2: more on this paper
3. 22 Feb, Davit Buniatyan – Unsupervised Learning of Video Representations using LSTMs. Presentation.
4. 1 March, Alexander Upjohn Beatson – Variational autoencoders: Auto-encoding Variational Bayes (https://arxiv.org/pdf/1312.6114.pdf) or Stochastic Backpropagation and Approximate Inference in Deep Generative Models (https://arxiv.org/pdf/1401.4082.pdf)
5. 8 March, Nikunj Umesh Saunshi – A unifying view of linear Gaussian models, Roweis and Ghahramani
6. 15 March, Wei Hu – A Spectral Algorithm for Learning Hidden Markov Models, spectral methods in unsupervised learning
7. 4 April, Ghassen Jerfel - An Information Theoretic Interpretation of Variational Inference based on the MDL Principle and the Bits-Back Coding Scheme. Notes from the talk.
9. 26 April Kiran Vodrahalli - rate distortion theory, notes
10. 3 May Yi Zhang – Generative Adversarial Network, slides