Fall, 2011

M/W, 11:00AM - 12:20PM

Friend 004

David M. Blei

Piazza Site (discussion and announcements)

We will study some advanced methods in probabilistic modeling that are central to modern machine learning and statistics. We will focus on four subjects:

- posterior inference with variational methods
- hierarchical modeling for grouped data
- model selection, specification, and checking
- Bayesian nonparametric modeling

We will emphasize algorithms and applications as well as the theoretical underpinnings of these subjects.

This course is appropriate for students who have taken COS513 "Foundations of Probabilistic Modeling" or who are familiar with the material from that course. Contact David Blei if you are unsure about whether this is the right course for you to take.

The course will consist of lectures and "practical" lectures. During practical lectures, we will implement and explore the properties of algorithms as a class. (We will learn and use R.)

The requirements are

- Brief reading response papers (less than one page)
- A midterm report and project proposal, due Friday, October 28
- A final project, due Monday, January 16 (LaTeX template)
- Class attendance and participation

- For the week of 9/26, read one of
- Blei, D. Graphical models and approximate posterior inference, 2004.
- Jordan, M., Z. Ghahramani, T. Jaakkola, and L. Saul. An introduction to variational methods for graphical models. *Machine Learning*, 37:183–233, 1999.
- Braun, M. and J. McAuliffe. Variational inference for large-scale models of discrete choice. *Journal of the American Statistical Association*, 105:324–335, 2010.

- For the week of 10/3, read one of
- A paper you didn't read for the week of 9/26.
- Beal, M. and Ghahramani, Z. The variational Bayesian EM algorithm for incomplete data: with Application to Scoring Graphical Model Structures. *Bayesian Statistics 7*, 2002.
- Minka, T. Chapter 3: Expectation propagation. In *A family of algorithms for approximate Bayesian inference*, PhD Dissertation, 2001.
- Wainwright, M. and Jordan, M. Chapter 3: Graphical models as exponential families. In *Graphical models, exponential families, and variational inference*. Foundations and Trends in Machine Learning, 2008.
- Wainwright, M. and Jordan, M. Chapter 5: Mean field methods. In *Graphical models, exponential families, and variational inference*. Foundations and Trends in Machine Learning, 2008.

- For the week of 10/10, read
- Gelman, A., Carlin, J., Stern, H., and Rubin, D. "Chapter 5: Hierarchical Models." In *Bayesian Data Analysis*, Chapman and Hall, 2004.

- For the week of 10/17, read one of
- Gelman, A. and Hill, J. "Chapter 11: Multi-level Structures." In *Data Analysis Using Regression and Multilevel/Hierarchical Models*, Cambridge University Press, 2007.
- Blei, D., Ng, A., and Jordan, M. Latent Dirichlet Allocation. *Journal of Machine Learning Research*, 2003. (Skip the appendix.)

- For the week of 10/24, read:
- Efron, B. Empirical Bayes and the James-Stein Estimator. From *Large-Scale Inference*, 2010. (Skim Section 1.5.)

- For the week of 11/7, read one of:
- The latent Dirichlet allocation paper from the week of 10/17.
- Pritchard, J., Stephens, M., and Donnelly, P. Inference of Population Structure Using Multilocus Genotype Data. *Genetics*, 155:945–959, 2000.
- Airoldi, E., Blei, D., Fienberg, S., and Xing, E. Mixed Membership Stochastic Blockmodels. *Journal of Machine Learning Research*, 9:1981–2014, 2008.
- Blei, D. and Lafferty, J. A Correlated Topic Model of *Science*. *Annals of Applied Statistics*, 1[1]:17–35, 2007.

- For the week of 11/14, read a paper you didn't read from last week.
- For each of the weeks of 11/28 and 12/5, read one of
- S. Gershman and D. Blei. A tutorial on Bayesian nonparametric models. Journal of Mathematical Psychology, to appear.
- R. Neal. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9[2]:249–265, 2000.
- D. Blei and M. Jordan. Variational inference for Dirichlet process mixtures. Bayesian Analysis, 1[1]:121–144, 2006.
- J. Sethuraman. A constructive definition of Dirichlet priors. *Statistica Sinica*, 4:639–650, 1994.
- Y. Teh and M. Jordan. Hierarchical Bayesian nonparametric models with applications. In Bayesian Nonparametrics, Cambridge University Press, 2010.
- T. Griffiths and Z. Ghahramani. The Indian buffet process: An introduction and review. Journal of Machine Learning Research, 12:1185–1224, 2011.
- S. Goldwater, T. Griffiths, and M. Johnson. Producing power-law distributions and damping word frequencies with two-stage language models. Journal of Machine Learning Research, 12:2335–2382, 2011.

- For the final reading report, read one or more of the following.
- Gelman, A., Meng, X., and Stern, H. Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6:733–807, 1996.
- Box, G. Sampling and Bayes' inference in scientific modelling and robustness. Journal of the Royal Statistical Society, Series A (General), 143[4]:383–430, 1980.
- Rubin, D. Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12[4]:1151–1172, 1984.
- Gelman, A. and Shalizi, C. Philosophy and the practice of Bayesian statistics. 2010.
- Blei, D. Stochastic variational inference. 2011.

- 9/19: Introduction, course overview, course requirements
- 9/21: Review (graphical models, posterior inference, computation)

- 9/26: Exponential families
- 9/28, 10/3, 10/5: Variational inference
- 10/10: Practical: Bayesian mixture of Gaussians

- [PDF of notes]
- 10/17: Introduction to hierarchical modeling
- 10/19, 10/24: Hierarchical generalized linear models
- 10/26: James-Stein estimation and empirical Bayes

- [PDF of notes]
- 11/7: Introduction to mixed-membership models
- 11/9: Probabilistic topic models
- 11/14: Gibbs sampling in topic models (Guest lecturer: David Mimno)
- 11/16: Variational inference in topic models

- [PDF of notes]
- 11/28: Chinese restaurant processes
- 11/30: Chinese restaurant process mixtures and Gibbs sampling
- 12/5: Gibbs sampling (cont.) and demonstration
- 12/7: Dirichlet processes and random measures
- 12/12: Stick-breaking constructions; hierarchical Dirichlet processes

- 12/14: Stochastic variational inference

- [PDF of notes]
- 12/16: Posterior predictive checks