Princeton University

Computer Science 521

Fall 2006 
Design and analysis of algorithms is an important part of computer science today. This course introduces students to advanced techniques for algorithm design and analysis, and explores a variety of applications.
We will devote roughly two weeks each to several major areas of algorithms research: data structures, online algorithms, maximum flow, linear programming, Markov Chain Monte Carlo (MCMC), algorithms in machine learning, internet algorithms, and algorithms for large data sets. En route, we will encounter various useful ideas, including randomization, probabilistic analysis, amortized analysis, competitive analysis, eigenvalues, high-dimensional geometry, and random walks.
Prerequisite: one undergraduate course in algorithms, or equivalent mathematical maturity. Listeners and auditors are welcome. There will be 5-6 homeworks during the term, and a take-home final exam. We may read a few research papers and have a few "field trips" (attending interesting talks on cutting-edge algorithms research). There is no official text; instead, we will use assorted online resources.
Professor: Moses Charikar, 305 CS Building, 258-7447, moses [AT] cs.princeton.edu. Office hours by appointment.
Graduate Coordinator: Melissa Lawson, 310 CS Building, 258-5387, mml [AT] cs.princeton.edu.
Click here to join the mailing list for this class.
The take-home final will be posted here on Jan 17. You must finish the final within 48 hours of first reading it. The last day to submit your solutions is Mon, Jan 22. You may use any notes/handouts from the class and may quote any results from them; you are not allowed to use any other sources.
To get an idea of what the final might be like, take a look at last year's final.
This is the final exam (ps, pdf). Do not read it before you are ready to work on it. Send Moses an email to let him know you have started the test, in case any clarifications need to be made.
All the best!
Lecture | Assigned reading | Optional reading and other links
1. Universal hashing. Probability and random variables.
First 8.5 pages of Universal Hashing by Peter Bro Miltersen.
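As a concrete illustration of the lecture topic, here is a short Python sketch of the Carter-Wegman universal family h(x) = ((ax + b) mod p) mod m. The function names and the choice of prime are mine, not from the assigned notes.

```python
import random

def make_universal_hash(m, p=2_147_483_647):
    """Draw a random member of the Carter-Wegman family
    h(x) = ((a*x + b) mod p) mod m, for integer keys x < p.
    For any fixed pair x != y, the collision probability over the
    random choice of (a, b) is O(1/m)."""
    a = random.randrange(1, p)   # a is nonzero
    b = random.randrange(0, p)

    def h(x):
        return ((a * x + b) % p) % m

    return h
```

Drawing a fresh (a, b) pair per hash table is what makes the expected chain length O(1 + n/m) for any fixed key set; that is the point of universality.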
2. Perfect hashing. Bloom filters.
Additional notes on hashing (unedited): one, two. Network applications of Bloom Filters by Broder and Mitzenmacher.
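To make the Bloom filter reading concrete, here is a minimal Python sketch (illustrative; neither the reading nor the lecture prescribes this implementation). The k hash functions are simulated by salting SHA-256, a convenience assumption rather than a performance-oriented choice.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter with an m-bit array and k hash functions.
    No false negatives; after n insertions the false positive rate is
    roughly (1 - e^(-k*n/m))^k."""
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item):
        # k "independent" hash values from salted SHA-256
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))
```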

3. Balls and bins, and a few applications. Chernoff bounds, martingales, Azuma-Hoeffding inequality.
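A quick simulation (my own sketch, not from the lecture) of the classic balls-and-bins fact that throwing n balls into n bins uniformly at random gives a maximum load of Theta(log n / log log n) with high probability:

```python
import random

def max_load(n, trials=20, seed=0):
    """Throw n balls into n bins uniformly at random; return the worst
    maximum bin load seen over the given number of trials. Theory says
    the max load is Theta(log n / log log n) with high probability."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        bins = [0] * n
        for _ in range(n):
            bins[rng.randrange(n)] += 1
        worst = max(worst, max(bins))
    return worst
```

For n = 1000, log n / log log n is about 3.6, and the simulation indeed stays far below the naive worst case of n.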


4. Introduction to competitive analysis, list update.
A survey of self-organizing data structures, Albers and Westbrook (pages 1-17).
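The canonical list-update algorithm from this lecture is move-to-front. A tiny Python sketch of its access-cost model (illustrative; class and method names are mine):

```python
class MoveToFront:
    """Self-organizing list under the move-to-front rule: after each
    access, the requested item moves to the head of the list."""
    def __init__(self, items):
        self.items = list(items)

    def access(self, x):
        """Return the access cost (the 1-based position of x),
        then move x to the front."""
        i = self.items.index(x)                  # search cost is i + 1
        self.items.insert(0, self.items.pop(i))  # move to front
        return i + 1
```

Sleator and Tarjan showed that move-to-front is 2-competitive against the optimal offline algorithm, which is the flavor of result the Albers-Westbrook survey covers.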
5. Online algorithms contd: caching, deterministic and randomized algorithms.
Lecture notes (from Berkeley): deterministic algorithms (Section 2), and randomized algorithms.
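For the randomized side, here is a sketch of the randomized marking algorithm for paging (my own illustration; the Berkeley notes present it more carefully). On a miss it evicts a uniformly random unmarked page, and a new phase begins when every cached page is marked; this algorithm is O(log k)-competitive.

```python
import random

class MarkingCache:
    """Randomized marking algorithm for paging with cache size k."""
    def __init__(self, k, seed=0):
        self.k = k
        self.rng = random.Random(seed)
        self.cache = set()
        self.marked = set()
        self.misses = 0

    def request(self, page):
        if page not in self.cache:
            self.misses += 1
            if len(self.cache) == self.k:
                if self.marked == self.cache:
                    self.marked.clear()          # new phase: unmark all
                victim = self.rng.choice(sorted(self.cache - self.marked))
                self.cache.remove(victim)        # evict a random unmarked page
            self.cache.add(page)
        self.marked.add(page)                    # mark every requested page
```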

6. Online load balancing: uniform machines, restricted machines, and related machines.
"Online Computation and Competitive Analysis", Borodin and El-Yaniv, Chapter 12, Sections 12.1-12.3. Note: the proof in the restricted machines case is not quite correct. See the original proof in the paper of Azar, Naor, and Rom (Section 3.1).
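As a warm-up to the models in this lecture, here is a sketch of greedy list scheduling on identical machines, the simplest setting (illustrative; the uniform/restricted/related models in the reading need more machinery). Greedy assigns each arriving job to the currently least-loaded machine and is (2 - 1/m)-competitive (Graham).

```python
def greedy_load_balance(jobs, m):
    """Online greedy on m identical machines: each arriving job goes to
    the least-loaded machine. Returns the final loads and the assignment."""
    loads = [0] * m
    assignment = []
    for job in jobs:
        i = min(range(m), key=loads.__getitem__)  # least-loaded machine
        loads[i] += job
        assignment.append(i)
    return loads, assignment
```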

7. Introduction to max flow. 
Lecture slides (courtesy Kevin Wayne). 
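For reference alongside the slides, here is a compact Python sketch of the Edmonds-Karp algorithm (BFS augmenting paths in the residual graph, O(V E^2)); the data-structure choices are mine, not from Kevin Wayne's slides.

```python
from collections import defaultdict, deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly augment along a shortest s-t path in the
    residual graph, found by BFS. `capacity` maps (u, v) -> capacity."""
    res = defaultdict(int)                  # residual capacities
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)                       # back edges for the residual graph
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:        # BFS for a shortest augmenting path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                     # no augmenting path: flow is maximum
        path, v = [], t
        while parent[v] is not None:        # recover the path edges
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[e] for e in path)     # bottleneck capacity
        for u, v in path:
            res[(u, v)] -= aug
            res[(v, u)] += aug
        flow += aug
```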

8. Max flow contd: Goldberg-Rao.
Notes by Kurt Mehlhorn.  
9. Matchings in bipartite graphs, Hall's theorem, augmenting paths.
Notes by Santosh Vempala.
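The augmenting-path algorithm from this lecture, as a short Python sketch (identifier names are mine): for each left vertex we search for an augmenting path and flip it, giving an O(V E) maximum bipartite matching.

```python
def max_bipartite_matching(adj, n_right):
    """Maximum matching via augmenting paths. `adj[u]` lists the
    right-side neighbors of left vertex u; returns the matching size."""
    match_right = [None] * n_right          # match_right[v] = left partner of v

    def try_augment(u, seen):
        """DFS for an augmenting path starting at left vertex u."""
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # v is free, or v's partner can be rematched elsewhere
                if match_right[v] is None or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adj)))
```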
10. Matchings in general graphs, blossoms.
See above.

11. Introduction to Linear Programming and duality.
Sanjeev Arora's notes on LP duality. Michel Goemans's notes on linear programming.
12. (guest lecturer: Satyen Kale) Intro to algorithms in machine learning. Perceptron algorithm. Introduction to Support Vector Machines (SVMs).
What is Machine Learning by Rob Schapire. Andrew Ng's lecture notes on the perceptron algorithm.
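A minimal Python sketch of the perceptron algorithm from this lecture (my own illustration, not taken from Ng's notes): on each mistake the weight vector moves toward the misclassified example, and Novikoff's theorem bounds the number of mistakes by (R/gamma)^2 when the data are linearly separable with margin gamma.

```python
def perceptron(samples, epochs=100):
    """Classic perceptron. `samples` is a list of (x, y) with x a feature
    tuple and y in {-1, +1}. On each mistake, update w <- w + y*x."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:  # misclassified
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:   # converged: all samples classified correctly
            break
    return w
```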
13. (guest lecturer: Satyen Kale) AI algorithms contd. The multiplicative updates method and its connection to Lagrangian relaxation and linear programming.
The Multiplicative Weights Method: A Meta-Algorithm and its Applications by Sanjeev Arora, Elad Hazan, and Satyen Kale.
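A sketch of the basic multiplicative weights update from the Arora-Hazan-Kale survey, in Python (parameter names are mine): each round, every expert's weight is multiplied by (1 - eta * loss), and the algorithm plays the normalized weights. Its total loss exceeds the best expert's by at most roughly eta*T + ln(n)/eta over T rounds.

```python
def multiplicative_weights(loss_rounds, n, eta=0.1):
    """Multiplicative weights over n experts. `loss_rounds` is a list of
    length-n loss vectors with entries in [0, 1]. Returns the final
    weights and the algorithm's expected total loss."""
    w = [1.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        total = sum(w)
        p = [wi / total for wi in w]                       # play normalized weights
        total_loss += sum(pi * li for pi, li in zip(p, losses))
        w = [wi * (1 - eta * li) for wi, li in zip(w, losses)]
    return w, total_loss
```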
14. Introduction to solution of LPs: the ellipsoid method.
Santosh Vempala's notes on the ellipsoid method. For a discussion of the number of bits of precision needed, see Section 11 of Michel Goemans's notes on linear programming. Sections 6-7 in these notes have a description of the simplex method, which we briefly outlined in class.
I mentioned the paper of
Spielman and Teng that introduced smoothed analysis as a method for
analyzing the practical behavior of algorithms, inspired by
trying to understand why Simplex works well in practice. Read Section 1
for an overview of work in this area. 
15. Approximation Algorithms
using Linear Programs 
David Williamson's
lecture notes
on various ways to design approximation algorithms for set cover.
Satish Rao's
lecture notes on
minimum congestion routing via LP. 
This paper showed that the ln n guarantee for set cover is almost optimal. A recent paper shows that the Raghavan-Thompson randomized rounding procedure for minimum congestion is essentially optimal.
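For contrast with the LP-based algorithms in Williamson's notes, here is the greedy set cover algorithm as a Python sketch (illustrative): it achieves an H_n <= ln n + 1 approximation, which the first optional paper shows is essentially the best possible.

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements. Returns the indices of the chosen sets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("universe is not coverable by the given sets")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen
```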
16. Approximation Algorithms
based on Linear Programs (contd.) 
See David Williamson's notes
above. 

17. Approximation Algorithms
based on Semidefinite Programs 
David Williamson's
lecture notes on SDP based approximation algorithms. 
The second Karger-Motwani-Sudan rounding procedure for graph coloring described in Williamson's notes is slightly different from the one I showed in class. See page 11 of the paper by Karger, Motwani, and Sudan for a description of this.
18. Eigenvalues and expansion 
Sanjeev Arora's
notes on eigenvalues and expansion. 
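A small Python sketch of power iteration (my own illustration), the basic tool behind the eigenvalue computations in these lectures: repeatedly apply the matrix and renormalize, converging to the dominant eigenpair when the matrix is symmetric with a spectral gap (and the start vector is not orthogonal to the dominant eigenvector). Spectral partitioning then uses the second eigenvector, obtained by deflating against the first.

```python
def power_iteration(matrix, iters=200):
    """Power iteration on a symmetric matrix (given as a list of rows):
    returns the dominant eigenvalue (as a Rayleigh quotient) and an
    approximate unit eigenvector."""
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n                       # uniform unit start vector
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]                  # renormalize each step
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n))
                     for i in range(n))            # Rayleigh quotient v^T M v
    return eigenvalue, v
```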

19. Eigenvalues and expansion
(contd) 


20. Randomized Algorithms: Min Cut.
Notes on randomized min cut from CMU.
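A Python sketch of Karger's contraction algorithm from this lecture (implementation details are mine): contract uniformly random edges until two super-vertices remain; a single run outputs a minimum cut with probability at least 2/(n(n-1)), so O(n^2 log n) independent runs succeed with high probability.

```python
import random

def karger_min_cut(edges, n, seed=0):
    """One run of Karger's contraction algorithm on a multigraph with
    vertices 0..n-1. Returns the size of the cut between the two final
    super-vertices (a union-find tracks contractions)."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = n
    live = list(edges)
    while remaining > 2:
        u, v = live[rng.randrange(len(live))]   # contract a random live edge
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            remaining -= 1
        live = [(a, b) for (a, b) in live if find(a) != find(b)]  # drop self-loops
    return len(live)

def min_cut(edges, n, runs=30):
    """Best cut found over several independent contraction runs."""
    return min(karger_min_cut(edges, n, seed=s) for s in range(runs))
```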

21. Randomized Algorithms:
Identity testing 
Motwani and Raghavan, Chapter 7, pages 161-172.
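The core of randomized identity testing is the Schwartz-Zippel lemma: a nonzero polynomial of total degree d vanishes at a random point of S^n with probability at most d/|S|. A black-box Python sketch (the function names and the choice of prime are mine):

```python
import random

def polys_equal(p, q, n_vars, degree_bound, field_prime=1_000_003,
                trials=20, seed=0):
    """Randomized identity test: p and q are black-box functions of
    n_vars integer arguments, with total degree <= degree_bound. Each
    random evaluation (mod a prime) catches p != q with probability
    >= 1 - degree_bound/field_prime, by Schwartz-Zippel."""
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randrange(field_prime) for _ in range(n_vars)]
        if p(*point) % field_prime != q(*point) % field_prime:
            return False        # definitely different polynomials
    return True                 # equal, with high probability
```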

22. Parallel and Distributed
Algorithms. Brief introduction: Byzantine agreement and Parallel
Maximal Independent Set. 
Notes on parallel maximal independent set from Georgia Tech. Motwani and Raghavan, pages 358-361.
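A sequential Python simulation of one variant of Luby's randomized parallel MIS algorithm (illustrative): each round, every surviving vertex draws a random value, local minima join the independent set, and their neighborhoods drop out. O(log n) rounds suffice in expectation, and each round is fully parallelizable.

```python
import random

def luby_mis(adj, seed=0):
    """Maximal independent set on vertices 0..n-1, where adj[v] lists
    the neighbors of v. Simulates Luby's rounds sequentially."""
    rng = random.Random(seed)
    live = set(range(len(adj)))
    mis = set()
    while live:
        r = {v: rng.random() for v in live}           # one random value per vertex
        winners = {v for v in live
                   if all(r[v] < r[u] for u in adj[v] if u in live)}
        mis |= winners                                # local minima join the MIS
        removed = set(winners)
        for v in winners:
            removed |= set(adj[v])                    # neighborhoods drop out
        live -= removed
    return mis
```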

23. Algorithms for Large
Datasets: Streaming Algorithms 
The space
complexity of approximating the frequency moments by Alon, Matias
and Szegedy. 
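The "tug-of-war" estimator for the second frequency moment F2 = sum_i f_i^2 from the Alon-Matias-Szegedy paper, as a Python sketch. One simplification is mine: I use truly random signs stored in a dictionary, whereas the paper only needs 4-wise independent hash functions, which is what keeps the space small.

```python
import random

class AMSSketch:
    """AMS sketch for F2: each of r counters keeps Z = sum_i s(i)*f_i
    for a random sign function s, so E[Z^2] = F2; averaging the r
    counters reduces the variance."""
    def __init__(self, r=64, seed=0):
        self.rng = random.Random(seed)
        self.signs = [dict() for _ in range(r)]   # lazily drawn sign per item
        self.z = [0] * r

    def _sign(self, j, item):
        table = self.signs[j]
        if item not in table:
            table[item] = self.rng.choice((-1, 1))
        return table[item]

    def add(self, item):
        for j in range(len(self.z)):
            self.z[j] += self._sign(j, item)

    def estimate_f2(self):
        return sum(z * z for z in self.z) / len(self.z)
```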

24. Algorithms for Large
Datasets: Nearest Neighbor Search 

