FPO | Page 4 | Computer Science Department at Princeton University

Zhenyu Song will present his FPO "Optimizing Content Distribution Network Caches with Machine Learning" on Tuesday, August 1, 2023 at 1pm in CS 402.

Date and Time

Tuesday, August 1, 2023 - 1:00pm to 3:00pm

Location

Computer Science 402

Type

FPO

Zhenyu Song will present his FPO "Optimizing Content Distribution Network Caches with Machine Learning" on Tuesday, August 1, 2023 at 1pm in CS 402.

The members of his committee are as follows:

Readers: Ravi Netravali Daniel S. Berger (Microsoft & UW)

Examiners: Kai Li (Adviser), Wyatt Lloyd (Adviser), and Ryan Adams

All are welcome to attend.

Abstract:

Content Distribution Networks (CDNs) play a pivotal role in Internet traffic. A key part of this caching mechanism is the eviction algorithm that handles the replacement of old cached objects. The effectiveness of the eviction algorithm significantly influences CDN performance.

This dissertation explores the application of machine learning (ML) to optimize cache eviction algorithms in CDNs. The central questions addressed in this work are: how to utilize ML to devise an eviction algorithm that surpasses existing heuristics on byte miss ratio, and how to mitigate the CPU overhead while enhancing the robustness of a learned cache in large-scale deployment.

Two major challenges faced in the design of a learning-based cache eviction algorithm include heterogeneous user access patterns across different locations and times, and computational and space overheads. To address these challenges, we developed two ML-based eviction algorithms, Learning Relaxed Belady (LRB) and Heuristic Aided Learned Preference (HALP). LRB is the first CDN cache algorithm to directly approximate the Belady MIN (oracle) algorithm by learning access patterns, providing a significant improvement over traditional eviction algorithms. It demonstrated a WAN traffic reduction of 4-25% across six production CDN traces in our simulation. HALP, on the other hand, achieves low CPU overhead and robust DRAM byte miss ratio improvement by augmenting a heuristic policy with ML. It has shown to reduce DRAM byte miss during peak by an average of 9.1%, with a modest CPU overhead of 1.8%, while deployed in YouTube CDN production clusters.

Jiaxin Guan FPO

Date and Time

Wednesday, July 12, 2023 - 1:00pm to 3:00pm

Location

Computer Science Small Auditorium (Room 105)

Type

FPO

Jiaxin Guan will present his FPO "Cryptography against Space-Bounded Adversaries" on July 12, 2023 at 1pm in CS 105

The members of his committee are as follows:
Examiners: Mark Zhandry (Advisor), Gillat Kol, Ran Raz
Readers: Mark Braverman, Matt Weinberg

Abstract:
Traditionally in cryptography, we consider adversaries that are time-bounded by making specific computational assumptions. In this thesis, I examine the scenario where the adversaries are space-bounded instead, i.e. the adversary can only use up to a certain amount of memory bits. Under these scenarios, we can achieve either unconditional security properties or never-before-possible results.

First, we start off with Maurer's Bounded Storage Model. It is a model where the adversary abides to a certain memory bound throughout the entire game. Under this model, I show simple constructions of a key-agreement protocol, a commitment scheme, and an oblivious transfer protocol, all based on Raz's lower bound on parity learning. These constructions have several advantages over prior work, including an improved number of rounds and enhanced correctness.

Subsequently, I show that if we combine computational assumptions with the bounded storage model, we can achieve results that are not possible in the standard model. I define a new object named Online Obfuscation, and show how to use it to construct disappearing encryption and signature schemes where the ciphertext and the signature effectively "disappear" after transmission.

Lastly, I notice that in the Bounded Storage Model, the memory bound on the adversary is enforced throughout the entire game. One can imagine a variant where the bound is only enforced for long-term storage, allowing the adversary to use an arbitrary amount of memory during the transmission phase. I define incompressible cryptography to capture this intuition and show constructions using randomness extractors and other cryptographic tools. Furthermore, I show that under the multi-user setting, we can still achieve desired incompressible security if we simply replace the randomness extractor with a special "somewhere randomness extractor".

Clayton Thomas FPO (CS 105)

Date and Time

Wednesday, July 5, 2023 - 10:00am to 12:00pm

Location

Computer Science 402

Type

FPO

Clayton Thomas will present his FPO "Explainable Mechanism Design" on Wednesday, July 5, 2023 at 10am in CS 402 and Zoom.

Please see below for the members of his committee:
Readers: Huacheng Yu, Ran Raz
Examiners: Matt Weinberg (Adviser), Mark Braverman, and Leeat Yariv (Princeton Economics)

https://princeton.zoom.us/j/97275041458
Meeting ID: 972 7504 1458

Talk abstract follows below. All are welcome to join.

Abstract: Mechanism design strives to construct protocols that achieve desirable outcomes in the presence of strategic behavior. Algorithmic mechanism design additionally factors in notions from theoretical computer science—the study of how efficiently these algorithms can communicate between and be implemented on computers. This thesis largely takes a different direction. We study how to effectively explain these algorithms and protocols to real humans. In Part I, we directly confront the objective of better explaining a certain canonical and widely-deployed class of algorithms, namely, matching mechanisms. To begin, we study certain prior frameworks for explanation, and find them to be very restrictive in this setting. Thus, we introduce a new framework for describing these algorithms to participants in a fundamentally different way, with the goal of exposing strategy proofness, a crucial property dictating one’s optimal behavior in these mechanisms. We propose new descriptions, conduct a laboratory experiment on real participants to test our framework, and prove rigorous theorems about the limits of simple descriptions in our framework (according to context-relevant formal notions of simplicity). Then, we investigate a host of additional types of descriptions and explanations using the theoretical tools we have developed.

In Part II, we investigate other novel directions in algorithmic mechanism design that stem from running mechanisms in the real world. We provide two new, worst case separations in the communication requirements of certain canonical mechanism
design problems: first, for incentivizing mechanisms vs. simply computing them; second, for providing strong incentive grantees vs. providing weak ones. We provide a framework for running more complex mechanisms with computationally limited bidders; this framework allows for a broad class of reasonable behaviors of the bidders, and guarantees good performance under modest computational resources.

In Part III, we investigate perhaps more-traditional properties of matching mechanisms, the vital class of mechanisms studied in Part I. We provide new proofs of classical results on both the structure and dynamics of these mechanisms—some of
these proofs are far simpler than known ones. Finally, we investigate a new random distribution over the preferences of agents in these mechanisms, characterizing the effect of heterogeneous attractiveness across different agents.

Manik Dhar FPO (CS 105)

Date and Time

Friday, June 30, 2023 - 12:00pm to 2:00pm

Location

Computer Science Small Auditorium (Room 105)

Type

FPO

Manik Dhar will present his FPO "Beyond the polynomial method: Kakeya sets over finite rings and high dimensional variants" on Friday, June 30, 2023 at 12pm in CS 105.

The members of his committee are as follows: Readers: Ran Raz and Huacheng Yu; Examiners: Zeev Dvir (adviser), Bernard Chazelle, and Noga Alon (Math Department)

All are welcome to attend. Abstract follows below:

A Kakeya Set in Fn is a set that contains a line in every direction. It has been known for over a decade that such sets must be large. This goes back to Dvir’s proof of the finite field Kakeya conjecture as posed by Wolff. Dvir’s proof uses the polynomial method to solve this problem. The problem over finite fields also has applications in psuedorandomness and is crucially used in the construction of currently state-of-the-art seeded mergers and extractors. Wolff’s motivation was to pose a possibly simpler problem whose resolution might help with solving the notoriously difficult Euclidean Kakeya conjecture. The Euclidean Kakeya problem also has many connections with analysis, PDEs, and combinatorics. This thesis will look at two generalizations of the finite field Kakeya problem. One where Fq is replaced by the ring Z/N Z and another where we look at higher dimensional analogues of this problem. In both cases, the polynomial method does not work - as over Z/N Z polynomials do not represent all functions and in the other case a degree d polynomial can have much larger than d roots over a higher dimensional affine flat. We will cover a series of results that solve these problems by extending and in some cases going beyond the polynomial method. We will also cover recent applications of the higher dimensional Kakeya problem to give a tight analysis of the behaviour of linear hashes. At a high level, we show that if we pick subspaces of large enough dimension in comparison to the size of a set then for a large fraction of them their every shift intersects with the set in close to the expected number of points.

Diana Cai FPO

Date and Time

Wednesday, July 5, 2023 - 3:00pm to 5:00pm

Location

Not yet determined.

Type

FPO

Diana Cai will present her FPO "Probabilistic Inference When the Model is Wrong" on Wednesday, July 5, 2023 at 3:00 PM in COS 105.

Location: COS 105

The members of Diana’s committee are as follows:
Examiners: Ryan Adams (Adviser), Barbara Engelhardt (Adviser), Olga Russakovsky
Readers: Tom Griffiths, Tamara Broderick (MIT)

A copy of her thesis is available upon request. Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend her talk.

Abstract follows below:
By simplifying complex real-world phenomena, probabilistic methods have proven able to accelerate applications in discovery and design. However, classical theory often evaluates models under the assumption that they are perfect representations of the observed data. There remains the danger that these simplifications might sometimes lead to failure under real-world conditions. This dissertation identifies popular data analyses that can yield unreliable conclusions—and in some cases ones that are arbitrarily unreliable—under such “misspecification.” But we also show how to practically navigate misspecification. To begin, we consider clustering, a mainstay of modern unsupervised data analysis, using Bayesian finite mixture models. Some scientists are interested not only in finding meaningful groups of data but also in learning the number of such clusters. We provide novel theoretical results and empirical studies showing that, no matter how small the misspecification, some common approaches, including Bayesian robustness procedures, for learning the number of clusters give increasingly wrong answers as one receives more data. But using imperfect models need not be hopeless. For instance, we consider a popular Bayesian modeling framework for graphs based on the assumption of vertex exchangeability. A consequence of this assumption is that the resulting graph models generate dense graphs with probability 1 and are therefore misspecified for sparse graphs, a common property of many real-world graphs. To address this undesirable scaling behavior, we introduce an alternative generative modeling framework and prove that it generates a range of sparse and dense scaling behaviors; we also show empirically that it can generate graphs with sparse power law scaling behavior. Finally, we consider the case where a researcher has access to a sequence of approximate models that become arbitrarily more complex at the cost of more computation, which is common in applications with simulators of physical dynamics or models requiring numerical approximations of some fidelity. In this case, we show how to obtain estimates as though one had access to the most complex model. In particular, we propose a framework for constructing Markov chain Monte Carlo algorithms that asymptotically simulates from the most complex model while only ever evaluating models from the sequence of approximate models.

Ruairidh Battleday FPO

Date and Time

Thursday, June 29, 2023 - 1:00pm to 3:00pm

Location

Not yet determined.

Type

FPO

Ruairidh Battleday will present his FPO "The Role Of Nonparametric Inference In Computational Models Of Categorization And Analogy" on Thursday, June 29, 2023 at 1:00 PM in COS 401 and Zoom.

Location: Zoom link: https://princeton.zoom.us/j/91969001576

The members of Ruairidh’s committee are as follows:
Examiners: Tom Griffiths (Adviser), Ryan Adams, Barbara Engelhardt
Readers: Olga Russakovsky, Tania Lombrozo

A copy of his thesis is available upon request. Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:

This dissertation presents a computational investigation into the nature of human learning and inference. Its central thesis is that humans make good predictions in novel environments because they possess flexible abilities for inductive inference that can be used to generalize and update relevant knowledge abstracted from the past. This thesis is first explored in the context of natural image categorization. Here, participants’ judgments are best captured by models that use probabilistic strategies to relate novel stimuli to existing category members, and techniques from deep learning to represent these stimuli in high-dimensional mathematical spaces. The second context is generalization, where the underlying structure of different training games is shown to affect participants’ predictive generalizations on a final test game. These behaviors are accounted for by a nonparametric Bayesian model that aggregates mathematical abstractions of particular training environments, and then weights their predictions about unobserved interactions by their analogical relevance. Common to both lines of inquiry is the idea of using probabilistic models to account for flexible inferential behaviors, and tools from statistics and machine learning to abstract relevant mathematical representations from past experiences or stimuli.

Wei Zhan FPO

Date and Time

Friday, June 23, 2023 - 10:00am to 12:00pm

Location

Not yet determined.

Type

FPO

Details to follow

Minsung Kim FPO "Quantum and Quantum-Inspired Computation for Wireless Networks"

Date and Time

Tuesday, October 3, 2023 - 2:00pm to 3:30pm

Location

Not yet determined.

Type

FPO

Title: Quantum and Quantum-Inspired Computation for Wireless Networks.

Examiners:

Prof. Kyle Jamieson (CS, advisor): kylej (@cs.princeton.edu)

Prof. Jennifer Rexford (CS): jrex (@cs.princeton.edu)

Prof. Yasaman Ghasempour (ECE): ghasempour (@princeton.edu)

Readers:

Prof. Ravi Netravali (CS): rnetravali (@cs.princeton.edu)

Prof. Lin Zhong (Yale): lin.zhong (@yale.edu)

Dr. Davide Venturelli (NASA/USRA RIACS): DVenturelli (@usra.edu)

Vikram Ramaswamy FPO

Date and Time

Monday, May 8, 2023 - 12:00pm to 2:00pm

Location

Computer Science 402

Type

FPO

Vikram Ramaswamy will present his FPO "Tackling bias within Computer Vision Models" on Monday, May 8, 2023 at 12:00 PM in CS 402.

Location: CS 402

The members of Vikram’s committee are as follows: Examiners: Olga Russakovsky (Adviser), Ellen Zhong, Ryan Adams
Readers: Jia Deng, Andrés Monroy-Hernández

A copy of his thesis is available upon request. Please email gradinfo (@cs.princeton.edu) if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:
Over the past decade the rapid increase in the ability of computer vision models has led to their applications in a variety of real-world applications from self-driving cars to medical diagnoses. However, there is increasing concern about the fairness and transparency of these models. In this thesis, we tackle these issue of bias within these models along two different axes. First, we consider the datasets that these models are trained on. We use two different methods to create a more balanced training dataset. First, we create a synthetic balanced dataset by sampling strategically from the latent space of a generative network. Next, we explore the potential of creating a dataset through a method other than scraping the internet: we solicit images from workers around the world, creating a dataset that is balanced across different geographical regions. Both techniques are shown to help create models with less bias. Second, we consider methods to improve interpretability of these models, which can then reveal potential biases within the model. We investigate a class of interpretability methods called concept-based methods that output explanations for models in terms of human understandable semantic concepts. We demonstrate the need for more careful development of the datasets used to learn the explanation as well as the concepts used within these explanations. We construct a new method that allows for users to select a trade-off between the understandability and faithfulness of the explanation. Finally, we discuss how methods that completely explain a model can be developed, and provide heuristics for the same.

Yuting Yang FPO

Date and Time

Wednesday, May 3, 2023 - 3:00pm to 5:00pm

Location

Computer Science 402

Type

FPO

Yuting Yang will present her FPO "Exploiting Program Representation with Shader Applications" on Wednesday, May 3, 2023 at 3:00 PM in CS 402.

Location: CS 402

The members of Yuting’s committee are as follows:
Examiners: Adam Finkelstein (Adviser), Szymon Rusinkiewicz, Jia Deng
Readers: Felix Heide, Connelly Barnes (Adobe Research)

A copy of her thesis will be available upon request, two weeks before the FPO. Please email gradinfo (@cs.princeton.edu) if you would like a copy of the thesis.

Everyone is invited to attend her talk.

Abstract follows below:
Programs are widely used in content creation. For example, artists design shader programs to procedurally render scenes and textures, while musicians construct “synth” programs to generate electronic sound. While the generated content is typically the focus of attention, the programs themselves offer hidden potential for transformations that can support untapped applications. In this dissertation, we will discuss four projects that exploit the program structure to automatically apply machine learning or math transformations as if they were manually designed by domain experts. First, we describe a compiler-based framework with novel math rules to extend reverse mode automatic differentiation so as to provide accurate gradients for arbitrary discontinuous programs. The differentiation framework allows us to optimize procedural shader parameters to match target images. Second, we extend the differentiation framework to audio “synth” programs so as to match the acoustic properties of a provided sound clip. We next propose a compiler framework to automatically approximate the convolution of an arbitrary program with a Gaussian kernel in order to smooth the program for visual antialiasing. Finally, we explore the benefit of program representation in deep-learning tasks by proposing to learn from program traces of procedural fragment shaders – programs that generate images. In each of these settings, we demonstrate the benefit of exploiting the program structure to generalize hand-crafted techniques to arbitrary programs.