Arushi Gupta FPO | Computer Science Department at Princeton University

Date and Time

Thursday, February 8, 2024 - 1:30pm to 3:30pm

Location

Computer Science 402

Type

FPO

Arushi Gupta will present her FPO "Understanding the Role of Data in Model Decisions" on Thursday, February 8, 2024 at 1:30 PM in CS 402.

The members of Arushi’s committee are as follows:
Examiners: Sanjeev Arora (Adviser), Elad Hazan, Ryan Adams
Readers: Karthik Narasimhan, Tom Griffiths

A copy of her thesis is available upon request. Please email gradinfo (@cs.princeton.edu) if you would like a copy of the thesis.

Everyone is invited to attend her talk.

Abstract follows below:

As neural networks are increasingly employed in high-stakes applications such as criminal justice and medicine [1], it becomes increasingly important to understand why these models make the decisions they do. For example, it is important to develop tools to analyze whether models are perpetuating, in their future decision making, harmful demographic inequalities they have found in their training data [2]. However, neural networks typically require large training sets, have “black-box” decision making, and have costly retraining protocols, all of which make this problem harder. This work considers three questions. Q1) What is the relationship between the elements of an input and the model’s decision? Q2) What is the relationship between the individual training points and the model’s decision? And finally, Q3) To what extent do there exist (efficient) approximations that would allow practitioners to predict how model performance would change given different training data or a different training protocol?

Part I addresses Q1 for masking saliency methods. These methods implicitly assume that grey pixels in an image are “uninformative.” We find experimentally that this assumption may not always be true, and define “soundness,” which measures a desirable property of a saliency map.

Part II addresses Q2 and Q3 in the context of influence functions, which aim to approximate the effect of removing a training point on the model’s decision. We use harmonic analysis to examine a particular type of influence method, namely datamodels, and find that there is a relationship between the coefficients of the datamodel and the Fourier coefficients of the target function.

Finally, Part III addresses Q3 in the context of test data. First, we assess whether held-out test data is necessary to approximate the outer loop of meta-learning, or whether recycling training data constitutes a sufficient approximation. We find that held-out test data is important, as training with it yields representations that are low rank. Then, inspired by the PGDL competition [3], we investigate whether GAN-generated data, despite its well-known limitations, can be used to approximate generalization performance when no test or validation set is available, and find that it can.