Research
I am an assistant professor in the Computer Science department of Princeton University. (See my CV.)
My Ph.D. advisor was Michael Jordan at U.C. Berkeley Computer Science. I was a postdoctoral researcher with John Lafferty at CMU in the Machine Learning department.
My research interests include:
- Probabilistic graphical models and approximate posterior inference
- Topic models, information retrieval, and text processing
- Nonparametric Bayesian statistics
Graduate Students
Courses
- COS424: Interacting with data, Spring 2008
- COS597C: Bayesian nonparametrics, Fall 2007
- COS424: Interacting with data, Spring 2007
- COS402: Introduction to artificial intelligence, Fall 2006
- COS598C: Approximate inference in graphical models, Spring 2006
Topic Modeling Code and Browsers
Much of my research is in topic modeling, building hierarchical probabilistic models of documents and other media to uncover latent structure in their contents. As an example of this research, here are slides from a recent talk on dynamic and correlated topic models applied to the journal Science. (Here is a video of the talk.)
The structure uncovered by topic models can be used to explore an otherwise unorganized collection, for example by dividing documents according to their topics and by using the hidden structure to measure similarity between documents. The following are browsers of large document collections built with topic models:
- A 50-topic browser of the 2006 arXiv
- A 20-topic browser of The American Political Science Review
- A 100-topic browser of Science (1980-2000)
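As a rough illustration of the two uses described above, here is a minimal sketch that assumes each document has already been assigned a vector of topic proportions by a fitted topic model. The document names, the proportions, and the helper functions are hypothetical, and the Hellinger distance is only one common way to compare topic proportions; it is not necessarily what the browsers above use.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * total)


def dominant_topic(theta):
    """Index of the topic with the largest proportion in one document."""
    return max(range(len(theta)), key=theta.__getitem__)


def most_similar(query, proportions):
    """Other document ids, ordered from most to least similar to `query`."""
    others = [doc for doc in proportions if doc != query]
    return sorted(others, key=lambda doc: hellinger(proportions[query], proportions[doc]))


# Hypothetical topic proportions for three documents under a 3-topic model.
doc_topics = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.70, 0.20, 0.10],
    "doc_c": [0.05, 0.10, 0.85],
}

# Divide documents according to their dominant topic.
groups = {}
for doc, theta in doc_topics.items():
    groups.setdefault(dominant_topic(theta), []).append(doc)

# Rank the remaining documents by similarity to doc_a.
ranking = most_similar("doc_a", doc_topics)
```

Here `groups` maps each topic index to the documents it dominates, and `ranking` lists the other documents starting with the one closest to `doc_a` in topic space.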
Publications
- D. Blei and J. McAuliffe. Supervised topic models. In Advances in Neural Information Processing Systems 21, 2007. (PDF)
- J. Boyd-Graber, D. Blei, and X. Zhu. A topic model for word sense disambiguation. In Empirical Methods in Natural Language Processing, 2007. (PDF)
- W. Li, D. Blei, and A. McCallum. Nonparametric Bayes pachinko allocation. In The 23rd Conference on Uncertainty in Artificial Intelligence, 2007. (PDF)
- D. Kaplan and D. Blei. A computational approach to style in American poetry. In IEEE Conference on Data Mining, 2007.
- D. Blei and J. Lafferty. A correlated topic model of Science. Annals of Applied Statistics, 1(1):17–35. (PDF) (shorter version from NIPS 18) (code) (browser)
- M. Dudik, D. Blei, and R. Schapire. Hierarchical maximum entropy density estimation. Proceedings of the 24th International Conference on Machine Learning, 2007. (PDF)
- D. Blei and J. Lafferty. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, 2006. (PDF)
- J. McAuliffe, D. Blei, and M. Jordan. Nonparametric empirical Bayes for the Dirichlet process mixture model. Statistics and Computing, 16(1):5–14, 2006. (Springer) (TR PDF)
- D. Blei and M. Jordan. Variational inference for Dirichlet process mixtures. Journal of Bayesian Analysis, 1(1):121–144, 2006. (A shorter version appeared in ICML 2004). (PDF)
- Y. Teh, M. Jordan, M. Beal, and D. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006. (PDF) (Teh's code)
- T. Griffiths, M. Steyvers, D. Blei, and J. Tenenbaum. Integrating topics and syntax. In Advances in Neural Information Processing Systems 17, 2005. (PDF)
- D. Blei. Probabilistic Models of Text and Images. PhD thesis, U.C. Berkeley, Division of Computer Science, 2004. (PDF)
- D. Blei and M. Jordan. Modeling annotated data. In Proceedings of the 26th annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 127–134. ACM Press, 2003. (PDF)
- D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. Hierarchical topic models and the nested Chinese restaurant process. In Neural Information Processing Systems (NIPS) 16, 2003. (PDF)
- K. Barnard, P. Duygulu, N. de Freitas, D. Forsyth, D. Blei, and M. Jordan. Matching words and pictures. Journal of Machine Learning Research, 3:1107–1135, 2003. (PDF)
- D. Blei, A. Ng, and M. Jordan. Hierarchical Bayesian models for applications in information retrieval. In J. Bernardo, J. Berger, A. Dawid, D. Heckerman, A. Smith, and M. West, editors, Bayesian Statistics 7, volume 7, pages 25–44. Oxford University Press, 2003.
- D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003. (A shorter version appeared in NIPS 2002). (PDF) (code)
- D. Blei, J. Bagnell, and A. McCallum. Learning with scope, with application to information extraction and classification. In Uncertainty in Artificial Intelligence: Proceedings of the Eighteenth Conference (UAI-2002), pages 53–60, San Francisco, CA, 2002. Morgan Kaufmann Publishers.
- D. Blei and P. Moreno. Topic segmentation with an aspect hidden Markov model. In Proceedings of the 24th annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 343–348. ACM Press, 2001. (PDF)
- D. Blei and L. Kaelbling. Shortest paths in a dynamic uncertain domain. In IJCAI Workshop on Adaptive Spatial Representations of Dynamic Environments, 1999. (PDF)