Princeton University
Computer Science Department

Computer Science 511
Foundations of Machine Learning

Rob Schapire

Spring 2006



Schedule and readings

Numbers in brackets under "reading" refer to chapters or sections of Kearns & Vazirani.





1 Tu 2/7 General introduction to machine learning; consistency model scribe notes
2 Th 2/9 Consistency model; review of probability; begin PAC model scribe notes

For a very brief review of probability, see, for instance, Appendices C.2 and C.3 of Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.

3 Tu 2/14 PAC model; simple special cases; begin Occam's razor scribe notes

[1], [2.1-2.2]

4 Th 2/16 Prove Occam's razor; begin sample complexity for infinite hypothesis spaces scribe notes
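
The Occam's razor bound proved in this lecture takes the following standard form (exact constants may differ from the in-class statement): for a finite hypothesis class H, any hypothesis consistent with m examples drawn i.i.d. from the target distribution has true error at most ε with probability at least 1 − δ, provided

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right).
```

The ln|H| term is what the growth function and VC-dimension replace in the infinite-class bounds of the next lectures.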


5 Tu 2/21 Growth function and sample complexity for infinite hypothesis spaces; begin VC-dimension; Sauer's lemma; upper bound on sample complexity scribe notes


6 Th 2/23 VC-dimension; Sauer's lemma; upper bound on sample complexity; lower bounds on sample complexity scribe notes

[3.2-3.4], [3.6]

7 Tu 2/28 Finish lower bound; handling inconsistent hypotheses; begin Chernoff bounds scribe notes
8 Th 3/2 Finish Chernoff bounds; McDiarmid's inequality; error bounds for inconsistent hypotheses; overfitting; begin boosting scribe notes
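
The Chernoff/Hoeffding bound at the heart of the error bounds for inconsistent hypotheses states that if p̂ is the empirical average of m i.i.d. random variables taking values in [0, 1] with mean p, then

```latex
\Pr\bigl[\,|\hat{p} - p| \ge \epsilon\,\bigr] \;\le\; 2\,e^{-2m\epsilon^{2}}.
```

Applying this to each hypothesis and taking a union bound gives the generalization bounds covered in lecture.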
9 Tu 3/7 Boosting; training error bound; naive generalization bounds scribe notes

Robert E. Schapire.
The boosting approach to machine learning: An overview.
In D. D. Denison, M. H. Hansen, C. Holmes, B. Mallick, and B. Yu, editors, Nonlinear Estimation and Classification. Springer, 2003.
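
As a concrete illustration of the boosting algorithm covered in these lectures, here is a minimal sketch of AdaBoost on one-dimensional data, using threshold "stumps" as the weak learner. The function names and toy setup are illustrative, not taken from the course materials:

```python
import math

def adaboost(X, y, T):
    """AdaBoost on 1-D data with threshold stumps as weak learners.
    X: list of floats; y: list of +1/-1 labels; T: number of rounds."""
    m = len(X)
    D = [1.0 / m] * m            # start with the uniform distribution
    hyps = []                    # list of (threshold, sign, alpha)
    for _ in range(T):
        # weak learner: pick the stump h(x) = s*sign(x - theta)
        # minimizing weighted training error under D
        best = None
        for theta in sorted(set(X)):
            for s in (+1, -1):
                err = sum(D[i] for i in range(m)
                          if s * (1 if X[i] > theta else -1) != y[i])
                if best is None or err < best[0]:
                    best = (err, theta, s)
        err, theta, s = best
        err = max(err, 1e-10)    # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        hyps.append((theta, s, alpha))
        # reweight: mistakes get heavier, correct examples lighter
        newD = [D[i] * math.exp(-alpha * y[i]
                                * (s * (1 if X[i] > theta else -1)))
                for i in range(m)]
        Z = sum(newD)
        D = [w / Z for w in newD]
    return hyps

def predict(hyps, x):
    """Weighted vote of the stumps chosen by adaboost."""
    f = sum(alpha * s * (1 if x > theta else -1)
            for theta, s, alpha in hyps)
    return 1 if f >= 0 else -1
```

The exhaustive stump search is O(m^2) per round and is meant only to keep the weak learner self-contained.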

10 Th 3/9 Margins-based analysis of boosting scribe notes

Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee.
Boosting the margin: A new explanation for the effectiveness of voting methods.
The Annals of Statistics, 26(5):1651-1686, 1998.


11 Tu 3/14 Finish boosting; begin support-vector machines scribe notes

Excerpt from Vapnik's The Nature of Statistical Learning Theory.

12 Th 3/16 Finish support-vector machines scribe notes
13 Tu 3/28 On-line learning; learning with expert advice; weighted majority algorithm scribe notes

Avrim Blum.
On-line algorithms in machine learning.
In Dagstuhl Workshop on On-Line Algorithms, June, 1996.

"mind-reading" game based on on-line learning

14 Th 3/30 Finish weighted majority algorithm; perceptron algorithm scribe notes
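
The perceptron algorithm from this lecture is the classic additive-update rule: on each mistake, add y_i x_i to the weight vector. A minimal sketch (the toy data in the usage below is illustrative):

```python
def perceptron(X, y, max_epochs=100):
    """Classic perceptron through the origin.
    X: list of feature tuples; y: list of +1/-1 labels.
    On each mistake, update w <- w + y_i * x_i."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin <= 0:      # mistake (or zero margin): update
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
        if mistakes == 0:        # converged: data is separated
            break
    return w
```

For linearly separable data, the perceptron convergence theorem bounds the total number of updates by (R/γ)^2, where R bounds the example norms and γ is the margin of a separating vector.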
15 Tu 4/4 Winnow; begin regression scribe notes
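
Winnow replaces the perceptron's additive update with a multiplicative one, which gives mistake bounds growing only logarithmically in the number of irrelevant features. A rough sketch for {0,1} features, using the threshold-at-n formulation (one standard choice; parameter details vary by presentation):

```python
def winnow(X, y, alpha=2.0):
    """Winnow for 0/1 features and 0/1 labels: predict 1 iff w.x >= n;
    on a mistake, multiplicatively promote or demote the weights of
    the active features."""
    n = len(X[0])
    w = [1.0] * n
    mistakes = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) >= n else 0
        if pred != yi:
            mistakes += 1
            if yi == 1:   # false negative: promote active features
                w = [wj * alpha if xj else wj for wj, xj in zip(w, xi)]
            else:         # false positive: demote active features
                w = [wj / alpha if xj else wj for wj, xj in zip(w, xi)]
    return w, mistakes
```

On a target disjunction of k literals, the mistake bound covered in lecture is O(k log n), in contrast with the perceptron's dependence on the total number of features.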
16 Th 4/6 Linear regression; Widrow-Hoff algorithm scribe notes

Jyrki Kivinen and Manfred K. Warmuth.
Additive versus exponentiated gradient updates for linear prediction.
Information and Computation, 132(1):1-64, January, 1997.
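
The Widrow-Hoff (LMS) rule that the Kivinen-Warmuth paper takes as its additive-update baseline is just online gradient descent on the squared loss: w ← w − η (w·x − y) x. A minimal sketch (the toy data and step size below are illustrative):

```python
def widrow_hoff(X, y, eta=0.1, epochs=100):
    """Widrow-Hoff / LMS: for each example, step the weights against
    the gradient of the squared loss (w.x - y)^2 / 2."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            w = [wj - eta * err * xj for wj, xj in zip(w, xi)]
    return w
```

The exponentiated gradient (EG) algorithm analyzed in the paper keeps the same loss but updates the weights multiplicatively, trading the additive rule's dependence on the norm of the instances for a logarithmic dependence on the dimension.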

17 Tu 4/11 Finish Widrow-Hoff; exponentiated gradient (EG); converting on-line to batch scribe notes
Th 4/13 (no class)
18 Tu 4/18 Modeling probability distributions; maximum likelihood; maximum entropy modeling of distributions scribe notes

There are a number of excellent papers and other materials on maximum entropy available online.
The duality theorem given in class is taken from:

Stephen Della Pietra, Vincent Della Pietra and John Lafferty.
Inducing features of random fields.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4):380-393, April, 1997.
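
Roughly stated (see the paper for the precise conditions), the duality theorem says that among all distributions whose feature expectations match the empirical averages, the one of maximum entropy has the Gibbs (exponential-family) form

```latex
p_\lambda(x) \;=\; \frac{1}{Z_\lambda}\exp\Big(\sum_j \lambda_j f_j(x)\Big),
\qquad
Z_\lambda \;=\; \sum_x \exp\Big(\sum_j \lambda_j f_j(x)\Big),
```

and it coincides with the member of that exponential family maximizing the likelihood of the data.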

19 Th 4/20 Finish maxent; begin on-line log loss scribe notes
20 Tu 4/25 On-line log loss and universal compression; Bayes algorithm scribe notes
21 Th 4/27 Shifting experts scribe notes

Mark Herbster and Manfred K. Warmuth.
Tracking the Best Expert.
Machine Learning, 32(2):151-178, August, 1998.

22 Tu 5/2 Portfolio selection scribe notes

Avrim Blum and Adam Kalai.
Universal Portfolios With and Without Transaction Costs.
Machine Learning, 35:193-205, 1999.

For an opposing view to this approach, see the following short paper (and try to detect what is unusual about the writing):

Paul A. Samuelson.
Why we should not make mean log of wealth big though years to act are long.
Journal of Banking and Finance, 3:305-307, 1979.

23 Th 5/4 Game theory and learning scribe notes

Yoav Freund and Robert E. Schapire.
Game theory, on-line prediction and boosting.
In Proceedings of the Ninth Annual Conference on Computational Learning Theory, pages 325-332, 1996.

Or, for more details, see:

Yoav Freund and Robert E. Schapire.
Adaptive game playing using multiplicative weights.
Games and Economic Behavior, 29:79-103, 1999.



Here is a list of optional books for further background reading. All of these are being placed on reserve at the Engineering Library.