Princeton University
Computer Science 511
Numbers in brackets under "Recommended reading" refer to chapters or sections of Kearns & Vazirani.
| # | Date | Topic | Recommended reading |
|---|------|-------|---------------------|
| 1 | Tu 2/3 | General introduction to machine learning; consistency model | |
| 2 | Th 2/5 | Consistency model; review of probability | |
| 3 | Tu 2/10 | PAC model; simple special cases; Occam's razor | [1], [2.1-2.2] |
| 4 | Th 2/12 | Growth function and sample complexity for infinite hypothesis spaces | [3] |
| 5 | Tu 2/17 | VC-dimension; Sauer's lemma; upper bound on sample complexity | |
| 6 | Th 2/19 | Lower bounds on sample complexity; handling inconsistent hypotheses | |
| 7 | Tu 2/24 | Chernoff bounds; McDiarmid's inequality; error bounds for inconsistent hypotheses; overfitting | |
| 8 | Th 2/26 | Boosting; training error bound; naive generalization bounds | Robert E. Schapire. The boosting approach to machine learning: An overview. In MSRI Workshop on Nonlinear Estimation and Classification, 2002. |
| 9 | Tu 3/2 | Margins-based analysis of boosting | Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651-1686, 1998. |
| 10 | Th 3/4 | Finish boosting; support-vector machines | Excerpt from The Nature of Statistical Learning Theory. |
| 11 | Tu 3/9 | Finish support-vector machines; learning with expert advice; halving algorithm | Avrim Blum. On-line algorithms in machine learning. In Dagstuhl Workshop on On-Line Algorithms, June 1996. |
| 12 | Th 3/11 | Weighted majority algorithm (a short sketch follows the table) | |
| 13 | Tu 3/23 | Perceptron algorithm; Winnow algorithm (a perceptron sketch follows the table) | |
| 14 | Th 3/25 | Finish Winnow; begin regression; linear least squares | |
| 15 | Tu 3/30 | Widrow-Hoff algorithm; EG | Jyrki Kivinen and Manfred K. Warmuth. Additive versus exponentiated gradient updates for linear prediction. Information and Computation, 132(1):1-64, January 1997. |
| 16 | Th 4/1 | Converting on-line to batch; modeling probability distributions; maximum likelihood | |
| 17 | Tu 4/6 | Maximum entropy modeling of distributions | There are a number of excellent papers and other materials available on maximum entropy. The duality theorem given in class is taken from a paper by Stephen Della Pietra, Vincent Della Pietra, and John Lafferty. |
| 18 | Th 4/8 | Finish analyzing maximum entropy; on-line log loss and universal compression; begin Bayes algorithm | |
| 19 | Tu 4/13 | Analysis of Bayes algorithm; shifting experts | Mark Herbster and Manfred K. Warmuth. Tracking the best expert. Machine Learning, 32(2):151-178, August 1998. |
| 20 | Th 4/15 | Finish shifting experts; begin portfolio selection | |
| 21 | Tu 4/20 | Universal portfolios algorithm; begin reinforcement learning | Avrim Blum and Adam Kalai. Universal portfolios with and without transaction costs. Machine Learning, 35:193-205, 1999. |
| 22 | Th 4/22 | Guest lecture: Prof. Olga Troyanskaya | |
| 23, 24 | Tu 4/27, Th 4/29 | Reinforcement learning: value iteration, policy iteration, Q-learning, TD(λ) (a value-iteration sketch follows the table) | Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2001. See especially Chapters 3, 4, and 6. |
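For lecture 12, the following is a minimal Python sketch of the deterministic weighted majority algorithm for binary prediction with expert advice. The penalty parameter beta = 1/2 and the input format are illustrative assumptions, not taken from the course notes.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Deterministic weighted majority (sketch, assumed interface).

    expert_predictions: list of T rounds, each a list of {0,1} predictions,
                        one per expert.
    outcomes: list of T true labels in {0,1}.
    Returns the learner's total number of mistakes.
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n          # all experts start with equal weight
    mistakes = 0
    for preds, y in zip(expert_predictions, outcomes):
        # Predict with the weighted majority vote of the experts.
        vote_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        y_hat = 1 if vote_1 >= vote_0 else 0
        if y_hat != y:
            mistakes += 1
        # Multiplicatively penalize every expert that erred this round.
        weights = [w * beta if p != y else w
                   for w, p in zip(weights, preds)]
    return mistakes
```

With beta = 1/2 this yields the classical mistake bound of O(log n + m), where m is the number of mistakes of the best expert in hindsight.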
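For lecture 13, here is a minimal sketch of the classical perceptron update, assuming labels in {-1, +1}; the fixed number of passes is an illustrative choice. On linearly separable data the number of updates is bounded by (R/γ)², the usual perceptron mistake bound.

```python
def perceptron(examples, n_passes=10):
    """Classical perceptron (sketch, assumed interface).

    examples: list of (x, y) pairs with x a list of floats and y in {-1, +1}.
    Returns the learned weight vector and bias.
    """
    d = len(examples[0][0])
    w = [0.0] * d
    b = 0.0
    for _ in range(n_passes):
        for x, y in examples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:
                # Mistake: move the hyperplane toward the misclassified example.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b
```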
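For lectures 23-24, the following is a minimal sketch of value iteration via repeated Bellman backups, in the style of Sutton & Barto (Chapter 4). The dictionary encoding of transitions and rewards, the discount gamma = 0.9, and the stopping tolerance are assumptions for illustration.

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Value iteration for a finite MDP (sketch, assumed encoding).

    P[s][a]: list of (probability, next_state) pairs.
    R[s][a]: expected immediate reward for taking action a in state s.
    Returns the optimal state-value function V as a dict.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup: best one-step lookahead value.
            backup = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(backup - V[s]))
            V[s] = backup
        if delta < tol:      # stop once values have essentially converged
            return V
```

An optimal policy can then be read off greedily from V by choosing, in each state, the action that attains the maximum in the backup.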