CS Department Colloquium Series

Eliminating Bugs in Real Systems

Date and Time
Thursday, March 18, 2021 - 12:30pm to 1:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Wyatt Lloyd

Please register here.


Fraser Brown
Software is everywhere, and almost everywhere, software is broken. Some bugs just crash your printer; others hand an identity thief your bank account number; still others let nation-states spy on dissidents and persecute minorities.

This talk outlines my work preventing bugs using a blend of programming languages techniques and systems design. First, I'll talk about securing massive, security-critical codebases without clean-slate rewrites. This means rooting out hard-to-find bugs---as in Sys, which scales symbolic execution to find exploitable bugs in systems like the twenty-million-line Chrome browser. It also means proving correctness of especially vulnerable pieces of code---as in VeRA, which automatically verifies part of the Firefox JavaScript engine. Finally, I'll discuss work on stronger foundations for new systems---as in CirC, a recent project unifying compiler infrastructure for program verification, cryptographic proofs, optimization problems, and more.

Bio: Fraser Brown is a PhD student at Stanford advised by Dawson Engler, an occasional visiting student at UCSD working with Deian Stefan, and an NSF Graduate Research Fellowship recipient. She works at the intersection of programming languages, systems, and security, and her research has been used by several companies. She holds an undergraduate degree in English from Stanford.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Reliable Machine Learning in Feedback Systems

Date and Time
Monday, March 29, 2021 - 4:30pm to 5:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Elad Hazan

Please register here.


Sarah Dean
Machine learning techniques have been successful for processing complex information, and thus they have the potential to play an important role in data-driven decision-making and control. However, ensuring the reliability of these methods in feedback systems remains a challenge, since classic statistical and algorithmic guarantees do not always hold.

In this talk, I will provide rigorous guarantees of safety and discovery in dynamical settings relevant to robotics and recommendation systems. I take a perspective based on reachability, to specify which parts of the state space the system avoids (safety) or can be driven to (discovery). For data-driven control, we show finite-sample performance and safety guarantees which highlight relevant properties of the system to be controlled. For recommendation systems, we introduce a novel metric of discovery and show that it can be efficiently computed. In closing, I discuss how the reachability perspective can be used to design social-digital systems with a variety of important values in mind.

Bio: Sarah is a PhD candidate in the Department of Electrical Engineering and Computer Science at UC Berkeley, advised by Ben Recht. She received her MS in EECS from Berkeley and BSE in Electrical Engineering and Math from the University of Pennsylvania. Sarah is interested in the interplay between optimization, machine learning, and dynamics in real-world systems. Her research focuses on developing principled data-driven methods for control and decision-making, inspired by applications in robotics, recommendation systems, and developmental economics. She is a co-founder of a transdisciplinary student group, Graduates for Engaged and Extended Scholarship in computing and Engineering, and the recipient of a Berkeley Fellowship and an NSF Graduate Research Fellowship.



Data Structures and Algorithms in Sublinear Computation

Date and Time
Thursday, March 4, 2021 - 12:30pm to 1:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Zeev Dvir

Please register here.


Huacheng Yu
Sublinear algorithms remain viable as data volumes and processing demands grow exponentially. Such algorithms use a sublinear amount of resources, e.g., time, space, or communication that is asymptotically smaller than the input data size. Typical examples include data structures, which compute query functions of the data in sublinear time, and streaming algorithms, which make one pass over massive data streams while maintaining a sublinear-sized memory.

In this talk, I will give an overview of my work in sublinear computation, focusing on succinct data structures and distributed graph sketching algorithms. I will first discuss my work on a nearly optimal data structure for the dictionary problem, for which the textbook solution uses hash tables. Then, I will talk about detecting the connectivity of graphs using distributed sketching, and my recent work showing the optimality of a well-known sketching algorithm (the AGM sketch). I will conclude the talk with discussion on future directions and my other work in theoretical computer science. 
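As a toy illustration of the streaming model described above (not the AGM sketch itself, which is randomized and considerably more subtle), a one-pass connectivity check can keep memory proportional to the number of vertices rather than the potentially far larger number of streamed edges:

```python
class DSU:
    """Disjoint-set union: memory is O(n) in the vertex count,
    independent of how many edges the stream contains."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

def stream_connectivity(n, edge_stream):
    """Single pass over the edge stream; returns the number of
    connected components using O(n) memory."""
    dsu = DSU(n)
    for u, v in edge_stream:
        dsu.union(u, v)
    return len({dsu.find(x) for x in range(n)})

# A graph on 6 vertices streamed edge by edge.
edges = [(0, 1), (1, 2), (3, 4), (0, 2), (4, 5)]
print(stream_connectivity(6, edges))  # 2 components: {0,1,2} and {3,4,5}
```

For dense graphs with m >> n edges, O(n) memory is sublinear in the stream length; the distributed sketching setting in the talk is harder still, since each vertex must compress even its own edge list.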

Bio: Huacheng Yu is an associate research scholar in the Department of Computer Science at Princeton University. His research interests include data structures and streaming algorithms, and other directions in theoretical computer science such as communication complexity and graph algorithms. Prior to Princeton, Huacheng was a postdoctoral researcher at Harvard University hosted by Jelani Nelson and Madhu Sudan. He received his Ph.D. from Stanford University (advised by Ryan Williams and Omer Reingold) and B.Eng. from Tsinghua University, both in Computer Science.



Probabilistic Proofs: Theory, Hardware, and Everything in Between

Date and Time
Monday, March 22, 2021 - 12:30pm to 1:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Andrew Appel

Please register here.


Riad Wahby
In the past decade, systems that use probabilistic proofs in real-world applications have seen explosive growth. These systems build upon some of the crown jewels of theoretical computer science---interactive proofs, probabilistically checkable proofs, and zero-knowledge proofs---to solve problems of trust and privacy in a wide range of settings.

This talk describes my work building systems that answer questions ranging from "how can we build trustworthy hardware that uses untrusted components?" to "how can we reduce the cost of verifying smart contract execution in blockchains?" Along the way, I will discuss the pervasive challenges of efficiency, expressiveness, and scalability in this research area; my approach to addressing these challenges; and future directions that promise to bring this exciting technology to bear on an even wider range of applications.

Bio: Riad S. Wahby is a Ph.D. candidate at Stanford, advised by Dan Boneh and Keith Winstein.  His research interests include systems, computer security, and applied cryptography. Prior to attending Stanford, Riad spent ten years as an analog and mixed-signal integrated circuit designer. Riad and his collaborators received a 2016 IEEE Security and Privacy Distinguished Student Paper award; his work on hashing to elliptic curves is being standardized by the IETF.



Resource-Efficient Execution for Deep Learning

Date and Time
Wednesday, March 10, 2021 - 4:30pm to 5:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Kyle Jamieson

Please register here.


Deepak Narayanan
Deep Learning models have enabled state-of-the-art results across a broad range of applications; however, training these models is extremely time- and resource-intensive, taking weeks on clusters with thousands of expensive accelerators in the extreme case. In this talk, I will describe two systems that improve the resource efficiency of model training. The first system, PipeDream, proposes the use of pipelining to accelerate distributed training. Pipeline parallelism facilitates model training with lower communication overhead than previous methods while still ensuring high compute resource utilization. Pipeline parallelism also enables the efficient training of large models that do not fit on a single worker. Pipeline parallelism is being used at Facebook, Microsoft, OpenAI, and Nvidia for efficient large-scale model training. The second system, Gavel, determines how resources in a shared cluster with heterogeneous compute resources (e.g., different types of hardware accelerators) should be partitioned among different users to optimize objectives specified over multiple training jobs. Gavel can improve various scheduling objectives, such as average completion time, makespan, or cloud computing resource cost, by up to 3.5x. I will conclude the talk with discussion on future directions for optimizing Machine Learning systems.
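A back-of-the-envelope sketch shows why pipelining helps. This idealized model (my own illustration, not PipeDream's actual scheduler) assumes equal per-stage, per-microbatch time and ignores communication overhead and the weight-versioning issues PipeDream must also solve:

```python
def sequential_time(stages, microbatches, t=1.0):
    """Without pipelining, each microbatch traverses all stages alone."""
    return stages * microbatches * t

def pipelined_time(stages, microbatches, t=1.0):
    """With pipelining, stage s works on microbatch m at step s + m,
    so the last microbatch finishes after stages + microbatches - 1 steps."""
    return (stages + microbatches - 1) * t

S, M = 4, 8  # 4 pipeline stages, 8 microbatches
print(sequential_time(S, M))  # 32.0
print(pipelined_time(S, M))   # 11.0
```

As the number of microbatches grows relative to the number of stages, the pipeline's "fill and drain" overhead amortizes away and all stages stay busy nearly all the time.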

Bio: Deepak Narayanan is a final-year PhD student at Stanford University advised by Prof. Matei Zaharia. He is interested in designing and building software to improve the runtime performance and efficiency of emerging machine learning and data analytics workloads on modern hardware. His work is supported by an NSF graduate fellowship.


This talk will be recorded.

Integrating Machine Learning into Algorithm Design

Date and Time
Monday, February 22, 2021 - 12:30pm to 1:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Sanjeev Arora

Please register here.


Ellen Vitercik
An important property of the algorithms typically used in practice is broad applicability—the ability to solve problems across diverse domains. However, the default, out-of-the-box performance of these algorithms can be unsatisfactory, with slow runtime, poor solution quality, and even negative long-term social ramifications. In practice, there is often ample data available about the types of problems that an algorithm will be run on, data that can potentially be harnessed to fine-tune the algorithm’s performance. We therefore need principled approaches for using this data to obtain strong application-specific performance guarantees.

In this talk, I will give an overview of my research that provides practical methods built on firm theoretical foundations for incorporating machine learning and optimization into the process of algorithm design, selection, and configuration. I will describe my contributions across several diverse domains, including integer programming, clustering, mechanism design, and computational biology. As I will demonstrate, these seemingly disparate areas are connected by an overarching structure that implies broadly applicable guarantees.

Bio: Ellen Vitercik is a PhD student at Carnegie Mellon University where she is co-advised by Maria-Florina Balcan and Tuomas Sandholm. Her research revolves around artificial intelligence, algorithm design, and the interface between economics and computation, with a particular focus on machine learning theory. Among other honors, she is a recipient of the Exemplary Artificial Intelligence Track Paper Award at EC'19, the Best Presentation by a Student or Postdoctoral Researcher Award at EC'19, the NSF Graduate Research Fellowship, the IBM PhD Fellowship, the Fellowship in Digital Health from CMU's Center for Machine Learning and Health, and the Teaching Assistant of the Year Award from CMU's Machine Learning Department.


This talk will be recorded.

Exploiting Latent Structure and Bisimulation Metrics for Better Generalization in Reinforcement Learning

Date and Time
Monday, March 8, 2021 - 4:30pm to 5:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Tom Griffiths

Please register here.


Amy Zhang
The advent of deep learning has ushered in unprecedented progress across various fields of machine learning. Despite recent advances in deep reinforcement learning (RL) algorithms, however, there is no method today that exhibits anywhere near the generalization that we have seen in computer vision and NLP. Indeed, one might ask whether deep RL algorithms are even capable of the kind of generalization that is needed for open-world environments. This challenge is fundamental and will not be solved with incremental algorithmic advances.

In this talk, we propose to incorporate different assumptions that better reflect the real world and allow the design of novel algorithms with theoretical guarantees to address this fundamental problem. We first present how state abstractions can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction. Our goal is to learn state abstractions that both provide for effective downstream control and invariance to task-irrelevant details. We use bisimulation metrics to quantify behavioral similarity between states, and learn robust latent representations which encode only the task-relevant information from observations. We provide theoretical guarantees for the learned approximate abstraction and extend this notion to families of tasks with varying dynamics.
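For readers unfamiliar with the concept, one standard formulation of a bisimulation metric (following Ferns et al.; the exact variant used in the talk may differ) defines the distance between two states recursively via reward and transition similarity:

```latex
d(s_i, s_j) \;=\; \max_{a \in \mathcal{A}} \Big[ (1 - c)\,\big|\mathcal{R}(s_i, a) - \mathcal{R}(s_j, a)\big| \;+\; c\, W_1\!\big(\mathcal{P}(\cdot \mid s_i, a),\, \mathcal{P}(\cdot \mid s_j, a);\, d\big) \Big]
```

Here $W_1(\cdot, \cdot; d)$ is the 1-Wasserstein distance between next-state distributions under the metric $d$ itself, and $c \in [0, 1)$ trades off reward similarity against dynamics similarity. States that are close under $d$ yield similar rewards and transition to similarly-behaving states, which is exactly the invariance to task-irrelevant detail the abstract describes.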

Bio: I am a final year PhD candidate at McGill University and the Mila Institute, co-supervised by Profs. Joelle Pineau and Doina Precup. I am also a researcher at Facebook AI Research. My work focuses on bridging theory and practice through learning approximate state abstractions and learning representations for generalization in reinforcement learning. I previously obtained an M.Eng. in EECS and dual B.Sci. degrees in Mathematics and EECS from MIT.



The Measurement and Mismeasurement of Trustworthy ML

Date and Time
Monday, March 1, 2021 - 12:30pm to 1:30pm
Location
Zoom Webinar (off campus)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

Please register here.


Sanmi Koyejo
Across healthcare, science, and engineering, we increasingly employ machine learning (ML) to automate decision-making that, in turn, affects our lives in profound ways. However, ML can fail, with significant and long-lasting consequences. Reliably measuring such failures is the first step towards building robust and trustworthy learning machines. Consider algorithmic fairness, where widely-deployed fairness metrics can exacerbate group disparities and result in discriminatory outcomes. Moreover, existing metrics are often incompatible. Hence, selecting fairness metrics is an open problem. Measurement is also crucial for robustness, particularly in federated learning with error-prone devices. Here, once again, models constructed using well-accepted robustness metrics can fail. Across ML applications, the dire consequences of mismeasurement are a recurring theme. This talk will outline emerging strategies for addressing the measurement gap in ML and how this impacts trustworthiness.
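A toy example (hypothetical numbers, my own illustration rather than anything from the talk) shows how two widely used fairness metrics can be incompatible: a classifier that is perfect on both groups satisfies equalized true positive rates but violates demographic parity whenever the groups' base rates differ:

```python
def selection_rate(preds):
    """Fraction of individuals predicted positive (demographic parity compares these)."""
    return sum(preds) / len(preds)

def true_positive_rate(labels, preds):
    """Fraction of actual positives predicted positive (equal opportunity compares these)."""
    pos = [p for y, p in zip(labels, preds) if y == 1]
    return sum(pos) / len(pos)

# Hypothetical data: a perfect classifier, but group A has a 50%
# base rate of the positive label while group B has 25%.
labels_a, preds_a = [1, 1, 0, 0], [1, 1, 0, 0]
labels_b, preds_b = [1, 0, 0, 0], [1, 0, 0, 0]

print(true_positive_rate(labels_a, preds_a))  # 1.0
print(true_positive_rate(labels_b, preds_b))  # 1.0  -> equal TPR holds
print(selection_rate(preds_a))                # 0.5
print(selection_rate(preds_b))                # 0.25 -> demographic parity fails
```

Since no single classifier can satisfy both criteria here, the choice of which metric to enforce is itself a consequential measurement decision, which is the point of the talk's framing.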

Bio: Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo's research interests are in developing the principles and practice of trustworthy machine learning. Additionally, Koyejo focuses on applications to neuroscience and healthcare. Koyejo has been the recipient of several awards, including a best paper award from the Conference on Uncertainty in Artificial Intelligence (UAI), a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping (OHBM). Koyejo serves on the board of the Black in AI organization.



Improving Stack-Wide Resource Utilization for a Faster Mobile Web

Date and Time
Monday, March 23, 2020 - 12:30pm to 1:30pm
Location
Zoom (off campus)
Type
CS Department Colloquium Series
Host
Karthik Narasimhan

***Due to the developing situation surrounding the COVID-19 virus, this talk will be available for remote viewing.  See below for details.***

Ravi Netravali
Mobile web pages are integral to today's society, supporting critical services such as education, e-commerce, and social networking. Despite considerable academic and industrial research efforts, and major improvements over the past decade across the client-side web stack (i.e., networks, device CPUs, and browser engines), page load performance has plateaued and continues to fall short of user performance demands in practice. The consequences of this are far reaching: users abandon pages early, costing content providers billions of dollars in lost revenue; or pages are unusably slow, particularly in developing regions where web pages are often the sole gateway to the aforementioned services.

In this talk, I will describe the origin of this performance plateau in the context of serialized page load tasks that preclude effective utilization of the underlying network and CPU resources. Then, I will describe two complementary optimizations that my students and I have developed to eliminate these inefficiencies throughout the page load process and cut mobile load times in half. Key to these optimizations are judicious applications of programming languages (e.g., symbolic execution) and machine learning (e.g., reinforcement learning) techniques that enable us to 1) discover optimization knobs that preserve application correctness, and 2) tune those knobs according to stack-wide signals from the network, device, page, and browser, without developer intervention. I will conclude by describing how these underlying techniques can motivate and address a range of future challenges in networked applications and distributed systems. 

Bio: Ravi Netravali is an Assistant Professor of Computer Science at UCLA. His research interests are broadly in computer systems and networking, with a recent focus on building practical systems to improve the performance and debugging of large-scale, distributed applications for both end users and developers. His research has been recognized with an NSF CAREER Award, a Google Faculty Research Award, an ACM SoCC Best Paper Award, and an IRTF Applied Networking Research Prize. Prior to joining UCLA, Netravali received a PhD in Computer Science from MIT in 2018.


Zoom information:
Topic: Ravi Netravali CS Seminar
Time: Mar 23, 2020 12:00 PM Eastern Time (US and Canada)

Join Zoom Meeting
https://princeton.zoom.us/j/645162020

Meeting ID: 645 162 020


Deep Probabilistic Graphical Modeling

Date and Time
Thursday, March 12, 2020 - 12:30pm to 1:30pm
Location
Zoom (off campus)
Type
CS Department Colloquium Series
Host
Ryan Adams

***Due to the developing coronavirus situation, this talk will now be available for remote viewing via Zoom.  See below for full details.***

Adji Bousso Dieng
Deep learning (DL) is a powerful approach to modeling complex and large scale data. However, DL models lack interpretable quantities and calibrated uncertainty. In contrast, probabilistic graphical modeling (PGM) provides a framework for formulating an interpretable generative process of data and a way to express uncertainty about what we do not know. How can we develop machine learning methods that combine the expressivity of DL with the interpretability and calibration of PGM to build flexible models endowed with an interpretable latent structure that can be fit efficiently? I call this line of research deep probabilistic graphical modeling (DPGM). In this talk, I will discuss my work on developing DPGM both on the modeling and algorithmic fronts. In the first part of the talk I will show how DPGM enables learning document representations that are highly predictive of sentiment without requiring supervision. In the second part of the talk I will describe entropy-regularized adversarial learning, a scalable and generic algorithm for fitting DPGMs.

Bio: Adji Bousso Dieng is a PhD Candidate at Columbia University where she is jointly advised by David Blei and John Paisley. Her research is in Artificial Intelligence and Statistics, bridging probabilistic graphical models and deep learning. Dieng is supported by a Dean Fellowship from Columbia University. She won a Microsoft Azure Research Award and a Google PhD Fellowship in Machine Learning. She was recognized as a rising star in machine learning by the University of Maryland. Prior to Columbia, Dieng worked as a Junior Professional Associate at the World Bank. She did her undergraduate studies in France, where she attended Lycée Henri IV and Télécom ParisTech, part of France's Grandes Écoles system. She spent the third year of Télécom ParisTech's curriculum at Cornell University, where she earned a Master's in Statistics.


Topic: Adji Bousso Dieng CS Seminar
Time: Mar 12, 2020 12:30 PM Eastern Time (US and Canada)

Join Zoom Meeting
https://princeton.zoom.us/j/384273957 

Meeting ID: 384 273 957

One tap mobile
+16465588656,,384273957# US (New York)
+16699006833,,384273957# US (San Jose)

Dial by your location
        +1 646 558 8656 US (New York)
        +1 669 900 6833 US (San Jose)
Meeting ID: 384 273 957
Find your local number: https://princeton.zoom.us/u/abUHt2KPwU

Join by SIP
384273957@zoomcrc.com

Join by H.323
162.255.37.11 (US West)
162.255.36.11 (US East)
221.122.88.195 (China)
115.114.131.7 (India Mumbai)
115.114.115.7 (India Hyderabad)
213.19.144.110 (EMEA)
103.122.166.55 (Australia)
209.9.211.110 (Hong Kong)
64.211.144.160 (Brazil)
69.174.57.160 (Canada)
207.226.132.110 (Japan)
Meeting ID: 384 273 957
