Applications of Latent Variable Models in Modeling Influence and Decision Making | Computer Science Department at Princeton University

Report ID:

TR-944-13

Authors:

Gerrish, Sean

Date:

December 2012

Pages:

165

Abstract:

The past twenty years have seen an avalanche of digital information that is overwhelming people in industry, government, and academia. This avalanche is two-sided: while the past decade has seen an onslaught of digitized records (as governments, publishers, and researchers race to make their records digital), the electronic and software tools for computationally analyzing these data have quickly evolved to face this challenge.

Many of these challenges revolve around recurring patterns, including the presence of text, bits of information about pairs of items, and sequential observations. In this work we present several methods that address these data-analysis challenges by taking advantage of these recurring patterns.

We begin with a method for identifying influential documents in a collection that evolves over time. We demonstrate that by encoding our assumptions about influential documents in a statistical model of the changes in textual themes, we are able to provide an alternative bibliometric whose results are consistent with, yet different from, traditional metrics of influence such as citation counts.

We then introduce a model for measuring the relationships between pairs of countries over time. We demonstrate that this model is able to learn meaningful relationships between countries that are extraordinarily consistent across different human labels.

We next address limitations in existing models of legislative voting. In one extension we predict legislators' votes by using the text of the bills they are voting on combined with individual legislators' past voting behavior. We then introduce a method for inferring these lawmakers' positions on specific issues.

A recurring theme in the methods we present is that by using a small set of statistical primitives, we are able to apply known (or mildly adapted) methods to new problems. Several advances in statistical modeling over the past few decades make the development and discussion of our models easier, as they provide both this set of primitives (which can be interchanged easily) and the tools for working with them. As a final contribution, we describe a new method for fitting a statistical model with variational inference, without the time investment typically required of practitioners.