Topic Modeling Code and Browsers

Much of my research is in topic modeling, building hierarchical probabilistic models of documents and other media to uncover latent structure in their contents. As an example of this research, here are slides from a recent talk on dynamic and correlated topic models applied to the journal Science. (Here is a video of the talk.) John Lafferty and I wrote a review paper about this field.

The structure uncovered by topic models can be used to explore an otherwise unorganized collection, for example by dividing documents according to their topics and by using the hidden structure to determine similarity between documents. The following are browsers of large collections of documents built with topic models:

Also see Sean Gerrish's discipline browser for an interesting application of topic modeling at JSTOR.
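
As a rough illustration of how hidden topic structure can be used to group documents and measure their similarity, here is a minimal sketch in Python. It assumes each document has already been summarized by a vector of topic proportions; the numbers below are made up for illustration. It compares documents with the Hellinger distance, which is one reasonable choice among several. The function names are hypothetical and are not taken from any of the code posted on this page.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    with no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def dominant_topic(theta):
    """Index of the topic with the largest proportion in a document."""
    return max(range(len(theta)), key=lambda k: theta[k])


def most_similar(query, doc_topics):
    """Name of the document whose topic proportions are closest to the query's."""
    others = [name for name in doc_topics if name != query]
    return min(
        others,
        key=lambda name: hellinger(doc_topics[query], doc_topics[name]),
    )


# Hypothetical topic proportions for three documents over three topics.
doc_topics = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.70, 0.20, 0.10],
    "doc_c": [0.05, 0.10, 0.85],
}

# Divide documents by their dominant topic, then look up a nearest neighbor.
groups = {name: dominant_topic(theta) for name, theta in doc_topics.items()}
neighbor = most_similar("doc_a", doc_topics)
```

Here doc_a and doc_b fall in the same group and are each other's nearest neighbors, while doc_c sits in a different group. A real browser would obtain the topic proportions from a fitted model rather than from hand-written numbers.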

The topic models mailing list is a good forum for discussing topic modeling.

I maintain some code for topic modeling. Questions, comments, and suggestions about this code should be posted to the topic models mailing list.

There is other code that I do not maintain but still want to post. That is to say: I don't have time to answer questions about this code, but I hope that you will find it useful.

Some of my students have released code as well: