COS 511 Foundations of Machine Learning Theory: general information

COS 511 FOUNDATIONS OF MACHINE LEARNING

General Course Information

Lectures: Tues/Thurs 11:00-12:20.

Instructor: Rob Schapire (407 Computer Science, schapire@ , x8-7726).

Office Hours: By appointment or just stop by.

TA: Donna Gabai (205 Computer Science, dgabai@ , x8-1978).

TA Office Hours: Monday 2-3, Tuesday 12:30-1:30, Wednesday 2-3 or by appointment. (Note: No office hours on Mon. 2-17 due to snow.)

Course Description: Machine learning studies automatic methods for learning to make accurate predictions or useful decisions based on past observations. This course introduces the mathematical foundations of machine learning, including theoretical models of machine learning, and the design and analysis of learning algorithms. Topics include: bounds on the number of random examples needed to learn; learning from non-random examples in the on-line learning model (for instance, for investment portfolio selection); how to boost the accuracy of a weak learning algorithm; learning with queries; Fourier-based algorithms; support-vector machines.

Prerequisites: This class has no formal prerequisites, although it is assumed that students will have a general background in computer science (especially theoretical computer science), and an ability to solve mathematical problems rigorously.

Grading and workload: Grading will be based on homework assignments and a final project, with about equal weighting to each. Homeworks will be given roughly on a weekly basis and will each consist of a small number of problems (no programming). For the final project, you can pick any topic you want for further study. Your project could involve implementing an algorithm, running experiments, doing further reading or trying to theoretically extend a result of interest. In all cases, the end product will be a written report. More details will be given later in the semester.

Topics to be covered: Here is a tentative list of topics. If time permits, additional topics may be added. (Bullets do not correspond precisely to lectures.)

General introduction; consistency model.
Basic probability
PAC model; Occam's razor; Chernoff bounds
Geometric concepts; VC-dimension; upper and lower bounds on sample complexity
Boosting and margins theory
Support-vector machines and kernels
Mistake-bounded algorithms; halving algorithm; weighted majority algorithm
Linear-threshold algorithms; perceptron; winnow
On-line regression; EG and WH; Kivinen and Warmuth's framework
Portfolio selection; Cover's algorithm
Query models; Angluin's algorithm
Fourier analysis

(Suggested) Text: There is no textbook that perfectly fits the material in this course. However, the one that comes closest is An Introduction to Computational Learning Theory by Michael Kearns and Umesh Vazirani. I am not requiring this book, but if you want a text, this is the one that I would recommend. The book includes about a third to a half of what I intend to cover. I will also post other papers or notes for topics not in the book.

A list of other books for further background reading appears on the course home page, and are being placed on reserve at the Engineering Library.

Scribe notes: Because there is no perfect textbook for this course, students will be asked to take turns preparing "scribe notes" for posting on the course web site. Each class, one student will be the designated "scribe", taking careful notes during class, writing them up, and sending them to me for posting on the web. Here is more information on how to be a scribe. I will not grade the scribe notes, but I will give credit for doing them (about equal to one problem set).