Hierarchical Bayesian Modeling: Efficient Inference and Applications
Tools for managing large-scale data, such as online text, images, and
user profiles, are becoming increasingly important.
Hierarchical Bayesian models provide a natural framework for building
these tools due to their flexibility in modeling real-world data. In
this thesis, we describe a suite of efficient inference algorithms and
novel models under the hierarchical Bayesian modeling framework.
We first present a novel online inference algorithm for the
hierarchical Dirichlet process. The hierarchical Dirichlet process
(HDP) is a Bayesian nonparametric model that can be used to model
mixed-membership data with a potentially infinite number of
components. Our online variational inference algorithm scales to
massive and streaming data and is significantly faster than
traditional batch inference algorithms.
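The abstract does not spell out the algorithm, but online variational inference generally works by taking noisy, minibatch-based steps toward the full-data variational update, with a decaying learning rate. The following is a minimal sketch of that idea on a much simpler conjugate model (a streaming Dirichlet posterior over word proportions), not the HDP itself; the corpus size, minibatch size, and step-size schedule are illustrative assumptions.

```python
import numpy as np

def learning_rate(t, tau0=1.0, kappa=0.7):
    # Robbins-Monro step size: decays slowly enough to average out noise
    return (tau0 + t) ** (-kappa)

rng = np.random.default_rng(0)
true_probs = np.array([0.5, 0.3, 0.2])  # ground truth for this toy example
N, B = 1000, 10           # assumed corpus size and minibatch size
lam = np.ones(3)          # variational Dirichlet parameter (prior = 1)

for t in range(200):
    minibatch = rng.choice(3, size=B, p=true_probs)
    counts = np.bincount(minibatch, minlength=3)
    # noisy estimate of the full-data update, rescaled by N / B
    lam_hat = 1.0 + (N / B) * counts
    rho = learning_rate(t)
    # stochastic step: blend the current parameter with the noisy estimate
    lam = (1 - rho) * lam + rho * lam_hat

est = lam / lam.sum()  # posterior mean estimate of the proportions
```

Each minibatch is processed once and discarded, which is what makes this style of inference suitable for streaming data.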
Second, we present a generic approximation framework for variational
inference in a large family of nonconjugate models. This family
includes, for example, multi-level logistic regression and other
generalized linear models, as well as correlated topic models. With
this framework, deriving variational inference algorithms for many
nonconjugate models becomes much easier.
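One standard building block for approximating nonconjugate posteriors is the Laplace approximation: find the posterior mode, then fit a Gaussian whose variance comes from the curvature at that mode. Below is a minimal one-dimensional sketch on a toy Bayesian logistic regression; the data, prior, and finite-difference Hessian are illustrative assumptions, not the thesis's derivation.

```python
import numpy as np
from scipy import optimize

# toy data: 1-D logistic regression with a standard normal prior on w
rng = np.random.default_rng(1)
x = rng.normal(size=200)
w_true = 1.5
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-w_true * x)))

def neg_log_joint(w):
    logits = w * x
    # log p(y | w, x), written stably, plus log N(w; 0, 1) up to a constant
    ll = np.sum(y * logits - np.logaddexp(0.0, logits))
    return -(ll - 0.5 * w ** 2)

# mode of the posterior (MAP estimate)
w_map = optimize.minimize_scalar(neg_log_joint).x

# curvature at the mode via a central finite difference;
# its inverse is the variance of the Gaussian approximation
h = 1e-3
hess = (neg_log_joint(w_map + h) - 2 * neg_log_joint(w_map)
        + neg_log_joint(w_map - h)) / h ** 2
var = 1.0 / hess
```

The resulting Gaussian q(w) = N(w_map, var) can then stand in for the intractable factor inside a larger variational algorithm.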
Finally, we describe two novel models for real-world applications.
The first addresses simultaneous image classification and annotation.
We show that both tasks can be integrated under a single underlying
probabilistic model. The
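The abstract does not give the model's form; as a rough illustration of what "the same underlying probabilistic model" can mean, one latent representation of an image can feed both a classifier and an annotation predictor. All names and sizes in this sketch are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
K, C, T = 10, 4, 50  # hypothetical sizes: latent topics, classes, tags

theta = rng.dirichlet(np.ones(K))   # image's shared latent topic proportions
W_class = rng.normal(size=(C, K))   # class weights on the shared representation
W_tag = rng.normal(size=(T, K))     # tag weights on the same representation

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# both predictions are driven by the single latent vector theta
class_probs = softmax(W_class @ theta)                  # classification head
tag_probs = 1.0 / (1.0 + np.exp(-(W_tag @ theta)))      # per-tag annotation head
```

Because both heads share theta, evidence from annotations can sharpen classification and vice versa.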
second application is to better disseminate scientific information
through recommendation. Compared with traditional recommendation
algorithms, ours not only improves recommendation accuracy but also
yields interpretable structure for users and scientific articles. This
interpretability opens many possibilities for designing better
recommender systems.
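The abstract does not detail how interpretability arises. One way a recommender can become interpretable is to tie each article's latent vector to its topic proportions plus a small collaborative offset, so predicted preferences decompose along human-readable topics. This is a toy sketch under that assumption; all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, K = 8, 20, 5  # hypothetical sizes

# interpretable part: article topic proportions (e.g., from a topic model)
theta = rng.dirichlet(np.ones(K), size=n_items)
# collaborative part: small per-article offset learned from ratings
eps = 0.1 * rng.normal(size=(n_items, K))

V = theta + eps                    # article latent vectors
U = rng.normal(size=(n_users, K))  # user latent vectors

scores = U @ V.T                   # predicted preference, every user-article pair
top_for_user0 = np.argsort(scores[0])[::-1][:3]  # top-3 articles for user 0
```

Since V stays close to theta, a user's vector U can be read directly as topic interests, which is the kind of interpretable structure the abstract refers to.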