COS 511
Theoretical Machine Learning
Elad Hazan

Spring 2016

Course Summary

Can the mechanism of learning be automated and implemented by a machine?
In this course we formally define and study various models that have been proposed for learning. The course will present and contrast the statistical, computational and game-theoretic models for learning. We will present and rigorously analyze some of the most successful algorithms in machine learning that are extensively used today. Likely topics include: intro to statistical learning theory and bounds on the number of random examples needed to learn; learning in adversarial settings and the on-line learning model; using convex optimization to model and solve learning problems; learning with partial observability; how to boost the accuracy of a weak learning algorithm; universal portfolio selection; learning in games; unsupervised learning; spectral methods in learning.
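As a small taste of the online learning model and regret minimization mentioned above, here is a minimal sketch of online gradient descent on a one-dimensional interval. Everything in it — the quadratic losses, the domain [-1, 1], and the 1/sqrt(t) step-size schedule — is an illustrative assumption for this sketch, not course material.

```python
import math
import random

# Illustrative sketch (not from the course materials): online gradient
# descent (OGD) on the interval [-1, 1]. At round t the learner plays x_t,
# then the loss f_t(x) = 0.5 * (x - z_t)^2 is revealed and the learner
# takes a projected gradient step.

def ogd(targets, radius=1.0):
    """Play OGD with step size radius/sqrt(t); return the per-round losses."""
    x = 0.0
    losses = []
    for t, z in enumerate(targets, start=1):
        losses.append(0.5 * (x - z) ** 2)    # suffer loss at current point
        grad = x - z                          # gradient of this round's loss
        x -= (radius / math.sqrt(t)) * grad   # gradient step
        x = max(-radius, min(radius, x))      # project back onto [-1, 1]
    return losses

random.seed(0)
targets = [0.4 + 0.1 * random.gauss(0, 1) for _ in range(300)]
losses = ogd(targets)

# Regret against the best fixed point in hindsight, which for these
# quadratic losses is the mean of the targets.
best = sum(targets) / len(targets)
best_losses = [0.5 * (best - z) ** 2 for z in targets]
avg_regret = (sum(losses) - sum(best_losses)) / len(targets)
```

On this toy sequence the average regret shrinks toward zero as the horizon grows, which is the sense in which the online player "learns" despite seeing each loss only after committing to a decision.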

Administrative Information

Lectures: Tue + Thu, 11:00-12:20 in CS building, room 105.

Professor: Elad Hazan, CS building 407, Reception hours: Tue 14-15, Fri 10:30-11:30, or by appointment.

Teaching Assistants:
Brian Bullins, CS building second floor, Reception hours: Wednesdays 14:00-15:00
Jason Altschuler, CS building tea room, Reception hours: Fridays 14:30-16:00

Requirements: This is a graduate-level course that requires significant mathematical background.
Required background: probability, discrete math, calculus, analysis, linear algebra, algorithms and data structures, theory of computation / complexity theory
Recommended: linear programming, mathematical optimization, game theory

Attendance and the use of electronic devices: Attendance is expected at all lectures. The use of laptops and similar devices for note-taking is permitted.

Textbook and readings

1. (draft) Introduction to Online Convex Optimization, by E. Hazan, available here
2. An Introduction To Computational Learning Theory, by M.J. Kearns and U. Vazirani
3. Prediction, Learning, and Games, by N. Cesa-Bianchi and G. Lugosi
4. Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David
5. Boosting: Foundations and Algorithms, by R. E. Schapire and Y. Freund
6. Foundations of Machine Learning, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar

We'll post additional optional reading as the course progresses.

Grading and collaboration

Grading: homework assignments (60%) and a final take-home exam (40%). There will be 6-8 homework assignments, given roughly every week or two, each consisting of a small number of problems (with optional programming). All assignments carry equal weight.
Extra credit is given for constructive participation in class and for solving occasional "bonus problems" presented in class. Points may be deducted for repeated unexcused absences.
Turning in assignments and late policy: Coordinate submission of assignments with the TAs. Each late day reduces the attainable credit for the assignment by 10%.
Scribe notes: These are optional, but they are graded and earn extra credit equal to one additional assignment. Every student who is interested will have the option of scribing.
Collaboration: Collaboration on the problem sets is allowed; the final exam is individual. Clearly identify the people you collaborated with on assignments: before the solution to each question, list everyone you collaborated with on that particular question.

Lecture notes and schedule

Students taking or auditing the course can volunteer to write scribe notes for a lecture using LaTeX. LaTeX tutorials: tutorial 1 and tutorial 2.
Relevant style files appear here and definitions here. Scribes should strive for good mathematical style. A few pages from Mathematical Writing by Knuth, Larrabee, and Roberts provide some good hints.

For each lecture: topic, lecture notes, extra reading, and problem sets.

1. Introduction to Learning Theory; the statistical learning model.
   Reading: chapters 1-5 in book 4.
2. No free lunch; learning finite hypothesis classes; proof by the probabilistic method.
   Notes: lecture2. Reading: chapters 1-5 in book 4; The Probabilistic Method, Alon and Spencer; lecture notes on the probabilistic method. Problem set: homework 1.
3. Learning by optimization; introduction to convex optimization.
   Notes: lecture3. Reading: chapter 2 in book 1; Nemirovski book.
4. Iterative methods; gradient descent.
   Reading: chapter 2 in book 1; Bubeck book.
5. Reductions in optimization; introduction to regret minimization.
   Reading: chapters 2,3 in book 1.
6. Modelling learning problems as OCO; online gradient descent.
   Reading: chapter 3 in book 1; Zinkevich's paper.
7. From OCO to statistical learning; online-to-batch conversion.
   Reading: chapter 9 in book 1; Cesa-Bianchi et al. Problem set: homework 2.
8. Stochastic gradient descent; optimization for machine learning.
   Reading: chapters 2,3 in book 1.
9. Second-order methods; portfolio selection.
   Notes: lecture 9. Reading: chapters 3,4 in book 1; Cover's paper; simpler analysis here.
10. Special research talk (Brian Bullins): minimizing regret.
11. Second-order methods (cont.).
    Reading: chapter 4 in book 1.
12. Decision making.
    Reading: chapters 5,1 in book 1; MW survey. Problem set: homework 3.
13. Regularization (cont.).
    Reading: competing in the dark; Shalev survey.
14. Adaptive regularization.
    Notes: lecture14. Reading: AdaGrad paper.
15. Notes: lecture15. Reading: Schapire survey; my notes. Problem set: homework 4.
16. VC theory; characterization of (boolean) statistical learning.
    Notes: lecture16; slides. Reading: books 2,4,6.
17. VC theory (cont.); Rademacher complexity.
    Notes: lecture17. Reading: book 6, chapter 3.
18. Multi-armed bandits; bandit convex optimization.
    Notes: lecture18. Reading: Flaxman, Kalai, McMahan; book 1, chapter 6. Problem set: homework 5.
19. Bandit convex and linear optimization.
    Notes: lecture19. Reading: competing in the dark; book 1, chapter 6.
20. Introduction to Bayesian learning (Barbara Engelhardt).
21. Games, duality and regret.
    Notes: lecture 21. Reading: Adler; convergence of regret-minimization dynamics in concave games; book 1, chapter 8.
22. Matrix prediction, recommendation systems.
    Reading: OCG algorithm & implementation; Candes-Recht reconstruction approach; book 1, chapter 7.
23. Reinforcement learning.
    Notes: lecture 23. Reading: adversarial MDPs; online MDPs.
24. Ask me anything.