COS 511
Theoretical Machine Learning
Roi Livni

Spring 2017


Course Summary

Can the mechanism of learning be automated and implemented by a machine?
In this course we formally define and study various models that have been proposed for learning. The course will present and contrast the statistical, computational, and online models of learning. We will present and rigorously analyze some of the most successful and widely used algorithms in machine learning. Likely topics include: an introduction to statistical learning theory and bounds on the number of random examples needed to learn; learning in adversarial settings and the online learning model; using convex optimization to model and solve learning problems; learning with partial observability; how to boost the accuracy of a weak learning algorithm; universal portfolio selection; unsupervised learning; and spectral methods in learning.
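As a taste of the first topic, here is a minimal illustrative sketch (ours, not part of the official course materials) of the kind of bound studied in the opening lectures: for a finite hypothesis class H in the realizable PAC setting, m >= (ln|H| + ln(1/delta)) / epsilon i.i.d. examples suffice for empirical risk minimization to return an epsilon-accurate hypothesis with probability at least 1 - delta (see Chapters 1-5 of Book 4 below). The function name is our own.

import math

def pac_sample_bound(hypothesis_class_size: int, epsilon: float, delta: float) -> int:
    # Sufficient sample size for PAC-learning a finite class (realizable case):
    #   m >= (ln|H| + ln(1/delta)) / epsilon
    return math.ceil((math.log(hypothesis_class_size) + math.log(1.0 / delta)) / epsilon)

# Example: |H| = 2**20 hypotheses, accuracy epsilon = 0.1, confidence delta = 0.05.
print(pac_sample_bound(2**20, epsilon=0.1, delta=0.05))  # prints 169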


Administrative Information

Lectures: Thu, 13:30-14:50, CS Building 105; Wed, 16:30-17:50, Friend Center 008.

Professor: Roi Livni, CS Building 219. Reception hours: Wed 11:00-12:00, or by appointment.

Teaching Assistant: Kiran Vodrahalli, CS Building 402. Reception hours: Mon 15:00-16:30, or by appointment.

Requirements: This is a graduate-level course that requires significant mathematical background.
Required background: probability, discrete math, calculus, analysis, linear algebra, algorithms and data structures, theory of computation / complexity theory
Recommended: linear programming, mathematical optimization, game theory

Attendance and the use of electronic devices: Attendance is expected at all lectures. The use of laptops and similar devices for note-taking is permitted.


Textbook and readings

Textbooks:
1. (draft) Introduction to Online Convex Optimization, by E. Hazan, available here
2. An Introduction To Computational Learning Theory, by M.J. Kearns and U. Vazirani
3. Prediction, Learning, and Games, by N. Cesa-Bianchi and G. Lugosi
4. Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David
5. Boosting: Foundations and Algorithms, by R. E. Schapire and Y. Freund
6. Foundations of Machine Learning, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar

We'll post additional optional reading as the course progresses.


Grading and collaboration

Grading: homework assignments (65%) and a final take-home exam (35%). There will be 6-8 homework assignments, given roughly every one to two weeks, each consisting of a small number of problems (with optional programming components). All assignments carry equal weight.
Additional credit is given for constructive participation in class and for solving occasional "bonus problems" presented in class. Points may be deducted for extensive unexcused absence from class.
Turning in assignments and late policy: Coordinate submission of assignments with the TA. Each late day reduces the attainable credit for the assignment by 10%.

Collaboration: Collaboration on the problem sets is allowed; the final exam must be completed individually. Collaboration must be clearly documented: before the solution to each question, list everyone you collaborated with on that particular question.


Schedule

Number | Topic | Lecture notes | Extra reading | Problem sets
1 | Introduction to Learning Theory; The PAC Model | Lecture 1 | Chapters 1-5 in Book 4 | -
2 | Agnostic PAC, ERM; Learnability of Finite Classes | Lecture 2 | Chapters 1-5 in Book 4 | -
3 | VC Dimension; The Fundamental Theorem | Lecture 3 | Chapters 1-5 in Book 4 | -
4 | No Free Lunch; Sauer's Lemma | Lecture 4 | Chapters 1-5 in Book 4; The Probabilistic Method -- N. Alon & J. Spencer | Exercise 1
5 | Proof of the Fundamental Theorem; Neural Networks | Lecture 5 | Chapters 1-5 and 20 in Book 4; The Probabilistic Method -- N. Alon & J. Spencer | -
6 | Neural Networks: Expressiveness and Sample Complexity | Lecture 6 | Chapter 20 in Book 4; Neural Network Learning: Theoretical Foundations -- M. Anthony & P. Bartlett | Exercise 2
7 | Generalized PAC Model; Computational Learning Theory | Lecture 7 | Chapter 8 in Book 4; Santosh Vempala's lecture notes on LP: The Ellipsoid Algorithm | -
8 | Perceptron Algorithm; Hardness of Learning | Lecture 8 | A. Klivans, A. Sherstov (2009), "Cryptographic Hardness of Learning Intersections of Halfspaces" | -
9 | Boosting | Lecture 9 | Y. Freund, R. Schapire (1997), "A decision-theoretic generalization of on-line learning and an application to boosting" | -
10 | Boosting (cont.); Convex Learning Problems | Lecture 10 | Book 5 | -
11 | Surrogate Loss; Rademacher Complexity | Lecture 11 | Chapter 26 in Book 4 | Exercise 3
12 | Rademacher Bounds and Calculus | Lecture 12 | Chapter 26 in Book 4 | -
13 | The Sample Complexity of Linear Classes; Stochastic Gradient Descent | Lecture 13 | Chapter 26 in Book 4; S. Shalev-Shwartz, O. Shamir, N. Srebro, K. Sridharan (2009), "Stochastic Convex Optimization" | -
14 | Analysis of SGD | Lecture 14 | - | -
15 | Kernel Methods | Lecture 15 | - | Exercise 4
16 | Neural Networks 2 & Backpropagation | Lecture 16 | - | -
17 | Online Learning | Lecture 17 | - | -
18 | Expert Advice (Hedge Algorithm); Online to Batch | Lecture 18 | - | -
19 | Strong Convexity (Logarithmic Regret); Second-Order Methods | Lecture 19 | - | -
20 | Online Newton Step Analysis | Lecture 20 | - | -
21 | Follow the Regularized Leader | Lecture 21 | - | Exercise 5
22 | Learning with Partial Feedback | Lecture 22 | - | -
23 | Bandit Convex Optimization | Lecture 23 | - | -