CS Department Colloquium Series

Mathematical Foundations for Physical Agents

Date and Time
Wednesday, March 20, 2024 - 12:30pm to 1:30pm
Location
Engineering Quadrangle B205
Type
CS Department Colloquium Series
Speaker
Max Simchowitz, from Massachusetts Institute of Technology
Host
Elad Hazan, Chi Jin

Max Simchowitz
From robotics to autonomous vehicles, machine learning agents deployed in the physical world (“physical agents”) promise to revolutionize endeavors ranging from manufacturing to agriculture to domestic labor. In this talk, we will develop mathematical foundations, from the ground up, for how to carry out this vision. We will begin our investigation by examining linear dynamical systems, a simple and fundamental model of the interaction between a physical agent and its environment. We prove mathematically that simple exploration attains optimal performance for both some of the simplest and some of the most complex learning problems in this class. This finding, while powerful, strongly motivates moving past linear dynamics as a mathematical testbed for understanding learning with physical agents.
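The linear dynamical system mentioned above is commonly written x_{t+1} = A x_t + B u_t (+ noise). A minimal rollout sketch, using a 2-state system and a stabilizing gain chosen purely for illustration (not taken from the talk):

```python
import numpy as np

def simulate_lds(A, B, controller, x0, horizon):
    """Roll out the linear dynamics x_{t+1} = A x_t + B u_t, a standard
    model of a physical agent interacting with its environment."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for _ in range(horizon):
        u = controller(x)          # agent picks an action from the state
        x = A @ x + B @ u          # environment evolves linearly
        traj.append(x)
    return np.stack(traj)

# Hypothetical double-integrator-like system with a proportional controller.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[2.0, 3.0]])         # illustrative gain, not an optimal one
traj = simulate_lds(A, B, lambda x: -K @ x, x0=[1.0, 0.0], horizon=200)
```

With this gain, the closed-loop eigenvalues are 0.9 and 0.8, so the state decays toward the origin over the rollout.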

Hence, we turn to providing mathematical guarantees for a setting of real-world importance that does not fit the linear mold: behavior cloning. Behavior cloning — teaching a robot to imitate from example demonstrations — lies at the heart of many of today’s most promising robot learning endeavors due to its simplicity and intuitive data collection. Though it can work incredibly well, we still do not have a clear understanding of what circumstances ensure its success. Bringing together the flexibility of generative models with key intuitions arising from the study of linear control, we introduce a framework for behavior cloning that enables an agent to imitate nearly arbitrary behavior with provable guarantees, even when the dynamics governing the agent’s interaction with its environment are nonlinear. We conclude by outlining ongoing work and future steps towards building out the mathematical and conceptual tooling needed for general, capable, and flexible physical agents.
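In its simplest form, behavior cloning is supervised regression from demonstrated states to demonstrated actions. A toy sketch with an invented linear expert (this illustrates the basic setup only, not the generative-model framework from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert: a linear policy u = K* x, observed on random states.
K_star = np.array([[0.5, -1.0]])
states = rng.standard_normal((500, 2))           # demonstration states
actions = states @ K_star.T                      # expert's demonstrated actions

# Behavior cloning: fit a policy to the (state, action) pairs.
# With a linear policy class this is just least squares.
K_hat, *_ = np.linalg.lstsq(states, actions, rcond=None)
K_hat = K_hat.T
```

On noiseless linear demonstrations the cloned policy recovers the expert exactly; the hard theoretical questions arise once dynamics and policies are nonlinear.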

Bio: Max Simchowitz is a postdoctoral researcher in the Robot Locomotion Group at MIT CSAIL. He studies the theoretical foundations of machine learning problems with a sequential or dynamical component; he currently focuses on robotics and out-of-distribution learning, with past work ranging broadly across control, reinforcement learning, optimization, and algorithmic fairness. He received his PhD from the University of California, Berkeley in 2021 under Ben Recht and Michael I. Jordan, and his work has been recognized with an ICML 2018 Best Paper Award, an ICML 2022 Outstanding Paper Award, and an RSS 2023 Best Paper Finalist designation.


This talk is co-sponsored by the departments of Electrical and Computer Engineering and Computer Science.

To request accommodations for a disability please contact Lidia Stokman, lstokman@princeton.edu, at least one week prior to the event.

Making Language Models Useful

Date and Time
Thursday, February 29, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Karthik Narasimhan

Eric Mitchell
Large pre-trained language models, most notably GPT-3, are the engines of knowledge and capability underpinning powerful systems such as ChatGPT, Gemini, and Claude. Yet much like building a safe, comfortable vehicle requires more than a powerful engine, building a useful, beneficial language system requires additional techniques to promote key attributes such as controllability, factuality, and updatability. This talk will share my work towards imbuing large language models with these traits. I will first share the direct preference optimization algorithm, a more scalable algorithm for training language models to follow instructions in accordance with human preferences. I will next discuss approaches for improving the factual reliability of language models, which is challenging even for models that generally follow user instructions well. Finally, I will share my work towards methods for updating individual model behaviors or beliefs that have fallen out-of-date or are otherwise problematic. I will conclude with several important topics for future work toward more useful, trustworthy AI systems, including unsupervised continual learning, scalable oversight, and robust reasoning.
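For reference, the direct preference optimization loss for a single preference pair pushes the policy to widen the reference-relative log-probability margin between the preferred and dispreferred responses. A minimal sketch (the log-probability values in any real use would come from a language model; beta is a free hyperparameter):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are policy and reference-model log-probabilities of the
    preferred ("chosen") and dispreferred ("rejected") responses;
    beta controls how far the policy may drift from the reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the chosen response is favored.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference model, the margin is zero and the loss is log 2; favoring the chosen response drives the loss down.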

Bio: Eric Mitchell is a final-year PhD student in Stanford’s Computer Science department, advised by Chelsea Finn and Christopher Manning. His research uses tools from machine learning to improve the usefulness and reliability of language models, in particular by developing techniques that enhance their controllability, factuality, and updatability. His work has appeared in ICML, NeurIPS, ICLR, and EMNLP, and was recognized with an outstanding paper runner-up award at NeurIPS ’23. His work, in particular the direct preference optimization algorithm, has been used widely in state-of-the-art open-source and proprietary language models. He is a former Knight-Hennessy Scholar and received his BS from Princeton University.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Secure systems from insecure components

Date and Time
Thursday, March 7, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Amit Levy

Emma Dauterman
In many computer systems today, an attacker that breaks one system component can steal data from millions of users. In this talk, I will present two systems that can withstand component compromise. I will describe (1) a single sign-on system that protects user security and privacy from a compromised single sign-on server, and (2) a secure-hardware-based backup service that protects user backups from compromised secure hardware devices. These systems provide strong security and privacy properties while taking into account practical constraints such as compatibility requirements, hardware limitations, and user expectations. Each splits user secrets across different system components, using new cryptographic tools to provide necessary functionality while protecting user data.
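The secret-splitting idea can be illustrated with simple XOR-based sharing, where no single component's share reveals anything about the secret and all shares are needed to reconstruct it (a generic textbook sketch, not the actual protocols of the systems described, which rely on richer cryptographic tools):

```python
import secrets

def split_secret(secret: bytes, n_shares: int = 2):
    """Split a secret into n_shares XOR shares; any n_shares - 1 of them
    are statistically independent of the secret."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n_shares - 1)]
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def combine_shares(shares):
    """XOR all shares together to recover the original secret."""
    out = shares[0]
    for s in shares[1:]:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

shares = split_secret(b"user backup key", 3)   # one share per component
recovered = combine_shares(shares)
```

An attacker who compromises any one component learns only a uniformly random string.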

Bio: Emma Dauterman is a Ph.D. candidate at UC Berkeley where she is advised by Raluca Ada Popa and Ion Stoica. Her research interests include computer security, systems, and applied cryptography. She has received the Microsoft Research Ada Lovelace fellowship, the NSF graduate research fellowship, and a UC Berkeley EECS excellence award.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Scaling Deep Learning Up and Down

Date and Time
Tuesday, April 9, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Jia Deng

RESCHEDULED from March 21.


Zhuang Liu
Deep learning with neural networks has emerged as a key approach for discovering patterns and modeling relationships in complex data. AI systems powered by deep learning are used widely in applications across a broad spectrum of scales. There is a strong need to scale deep learning both upward and downward. Scaling up reflects the pursuit of scalability - the ability to utilize increasingly abundant computing and data resources to achieve superior capabilities while overcoming diminishing returns. Scaling down reflects the demand for efficiency - many application domains have limited data, and deployment often occurs in compute-limited settings. My research focuses on scaling deep learning both up and down, to build capable models and understand their behaviors in different computational and data environments.

In this talk, we present studies in both directions. For scaling up, we first explore the design of scalable neural network architectures that are widely adopted in various fields. We then discuss an intriguing observation on modern vision datasets and its implication on scaling training data. For scaling down, we introduce simple, effective, and widely used approaches for compressing convolutional networks and large language models, alongside interesting empirical findings. Notably, a recurring theme in this talk is the careful examination of implicit assumptions in the literature, which often leads to surprising revelations that reshape community understanding. Finally, we discuss exciting avenues for future deep learning and vision research, such as developing next-gen architectures and modeling datasets.

Bio: Zhuang Liu is currently a Research Scientist at Meta AI Research (FAIR) in New York City. He received his Ph.D. from UC Berkeley EECS in 2022, advised by Trevor Darrell. His research areas include deep learning and computer vision. His work focuses on scaling neural networks both up and down, to build capable models and understand their behaviors in different computational and data environments. His work is broadly applied in different areas of computing and other disciplines. He is a recipient of the CVPR 2017 Best Paper Award.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Rethinking Data Use in Large Language Models

Date and Time
Monday, March 4, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Danqi Chen

Sewon Min
Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data. Next, I will introduce a new class of LMs—nonparametric LMs—that repurpose this training data as a data store from which they retrieve information for improved accuracy and updatability. I will describe my work on establishing the foundations of such models, including one of the first broadly used neural retrieval models and an approach that simplifies a traditional, two-stage pipeline into one. I will also discuss how nonparametric models open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, improved factuality, and decentralization.
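The retrieval step at the heart of a nonparametric LM can be sketched as nearest-neighbor lookup over an embedded data store (the toy 2-dimensional embeddings below are invented for illustration; real systems use learned neural encoders over billions of passages):

```python
import numpy as np

def retrieve(query_vec, datastore_vecs, datastore_texts, k=2):
    """Score every data-store entry by cosine similarity to the query
    embedding and return the k best-matching texts."""
    q = query_vec / np.linalg.norm(query_vec)
    D = datastore_vecs / np.linalg.norm(datastore_vecs, axis=1, keepdims=True)
    top = np.argsort(-(D @ q))[:k]
    return [datastore_texts[i] for i in top]

texts = ["doc about cats", "doc about stocks", "doc about lions"]
vecs = np.array([[1.0, 0.0],    # toy embeddings, one row per document
                 [0.0, 1.0],
                 [0.9, 0.1]])
hits = retrieve(np.array([1.0, 0.0]), vecs, texts)
```

Because the model reads from the data store at inference time, updating its knowledge is a matter of editing the store rather than retraining parameters.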

Bio: Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023, received a J.P. Morgan Fellowship, and was named an EECS rising star in 2022.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Instance-Optimization: Rethinking Database Design for the Next 1000X

Date and Time
Monday, February 26, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Speaker
Jialin Ding, from AWS
Host
Wyatt Lloyd

Jialin Ding
Modern database systems aim to support a large class of different use cases while simultaneously achieving high performance. However, as a result of their generality, databases often achieve adequate performance for the average use case but do not achieve the best performance for any individual use case. In this talk, I will describe my work on designing databases that use machine learning and optimization techniques to automatically achieve performance much closer to the optimal for each individual use case. In particular, I will present my work on instance-optimized database storage layouts, in which the co-design of data structures and optimization policies improves query performance in analytic databases by orders of magnitude. I will highlight how these instance-optimized data layouts address various challenges posed by real-world database workloads and how I implemented and deployed them in production within Amazon Redshift, a widely-used commercial database system.

Bio: Jialin Ding is an Applied Scientist at AWS. Before that, he received his PhD in computer science from MIT, advised by Tim Kraska. He works broadly on applying machine learning and optimization techniques to improve data management systems, with a focus on building databases that automatically self-optimize to achieve high performance for any specific application. His work has appeared in top conferences such as SIGMOD, VLDB, and CIDR, and has been recognized by a Meta Research PhD Fellowship. To learn more about Jialin’s work, please visit https://jialinding.github.io/.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

AI Models for Edge Computing: Hardware-aware Optimizations for Efficiency

Date and Time
Monday, December 4, 2023 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Kai Li

Yiran Chen
As artificial intelligence (AI) transforms various industries, state-of-the-art models have exploded in size and capability. The growth in AI model complexity is rapidly outstripping hardware evolution, making the deployment of these models on edge devices challenging. To enable advanced AI locally, models must be optimized to fit within hardware constraints. In this presentation, we will first discuss how computing hardware designs impact the effectiveness of commonly used AI model optimizations for efficiency, including techniques like quantization and pruning. Additionally, we will present several methods, such as hardware-aware quantization and structured pruning, to demonstrate the significance of software/hardware co-design. We will also demonstrate how these methods can be understood via a straightforward theoretical framework, facilitating their seamless integration in practical applications and their straightforward extension to distributed edge computing. At the conclusion of our presentation, we will share our insights and vision for achieving efficient and robust AI at the edge.
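As a concrete instance of the quantization technique mentioned above, here is symmetric uniform int8 quantization of a weight tensor (a generic textbook sketch, not the hardware-aware methods from the talk):

```python
import numpy as np

def quantize_int8(w):
    """Map a float tensor to int8 with a single per-tensor scale.
    The max error after dequantization is half of one quantization step."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)  # stand-in weight tensor
q, scale = quantize_int8(w)
```

This shrinks storage 4x versus float32; hardware-aware variants additionally pick quantization granularities that the target accelerator executes efficiently.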

Bio: Yiran Chen received his B.S. (1998) and M.S. (2001) degrees from Tsinghua University and his Ph.D. (2005) from Purdue University. After spending five years in the industry, he joined the University of Pittsburgh in 2010 as an Assistant Professor and was promoted to Associate Professor with tenure in 2014, holding the Bicentennial Alumni Faculty Fellow position. He currently serves as the John Cocke Distinguished Professor of Electrical and Computer Engineering at Duke University. He is also the director of the NSF AI Institute for Edge Computing Leveraging Next-generation Networks (Athena), the NSF Industry-University Cooperative Research Center (IUCRC) for Alternative Sustainable and Intelligent Computing (ASIC), and the co-director of the Duke Center for Computational Evolutionary Intelligence (DCEI). His group's research focuses on new memory and storage systems, machine learning and neuromorphic computing, and mobile computing systems. Dr. Chen has published one book, more than 600 technical publications, and has been granted 96 US patents. He has received 11 Ten-Year Retrospective Influential Paper Awards, Outstanding Paper Awards, Best Paper Awards, and Best Student Paper Awards, as well as 2 best poster awards and 15 best paper nominations from various international journals, conferences, and workshops. He has been honored with numerous awards for his technical contributions and professional services, including the IEEE CASS Charles A. Desoer Technical Achievement Award and the IEEE Computer Society Edward J. McCluskey Technical Achievement Award. He has been a distinguished lecturer for IEEE CEDA and CAS, is a Fellow of the AAAS, ACM, and IEEE, and currently serves as the chair of ACM SIGDA and the Editor-in-Chief of the IEEE Circuits and Systems Magazine. He is a founding member of the steering committee of the Academic Alliance on AI Policy (AAAIP).


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.
This talk will be recorded and live streamed via Zoom.  Webinar registration here.

Enabling Collaboration between Creators and Generative Models

Date and Time
Thursday, November 30, 2023 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Jia Deng

Jun-Yan Zhu
Large-scale generative visual models, such as DALL·E2 and Stable Diffusion, have made content creation as effortless as writing a short text description. Meanwhile, these models also spark concerns among artists, designers, and photographers about job security and proper credit for their contributions to the training data. This leads to many questions: Will generative models make creators’ jobs obsolete? Should creators stop publicly sharing their work? Should we ban generative models altogether?

In this talk, I argue that human creators and generative models can coexist. To achieve this, we need to involve creators in the loop of both model inference and model training while crediting them for their involvement. I will first explore our recent efforts in model rewriting, which allows creators to freely control the model’s behavior by adding, altering, or removing concepts and rules. I will demonstrate several applications, including creating new visual effects, customizing models with multiple personal concepts, and removing copyrighted content. I will then discuss our data attribution algorithm for assessing the influence of each training image on a generated sample. Collectively, we aim to allow creators to leverage the models while retaining control over the creation process and data ownership.

Bio:  Jun-Yan Zhu is an Assistant Professor at CMU’s School of Computer Science. Prior to joining CMU, he was a Research Scientist at Adobe Research and a postdoc at MIT CSAIL. He obtained his Ph.D. from UC Berkeley and B.E. from Tsinghua University. He studies computer vision, computer graphics, and computational photography. His current research focuses on generative models for visual storytelling. He has received the Packard Fellowship, the NSF CAREER Award, the ACM SIGGRAPH Outstanding Doctoral Dissertation Award, and the UC Berkeley EECS David J. Sakrison Memorial Prize for outstanding doctoral research, among other awards. 


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.
This talk will be recorded and live streamed via Zoom.  Register for webinar here.

3D-aware Representation Learning for Vision

Date and Time
Thursday, October 26, 2023 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ellen Zhong

Vincent Sitzmann
Given only a single picture, people are capable of inferring a mental representation that encodes rich information about the underlying 3D scene. We acquire this skill not through massive labeled datasets of 3D scenes, but through self-supervised observation and interaction. Building machines that can infer similarly rich neural scene representations is critical if they are to one day parallel people’s ability to understand, navigate, and interact with their surroundings. In my talk, I will discuss how this motivates a 3D approach to self-supervised learning for vision. I will then present my research group’s recent advances towards training self-supervised scene representation learning methods at scale, on uncurated video without pre-computed camera poses. I will further present recent advances towards modeling uncertainty in 3D scenes, as well as progress on endowing neural scene representations with more semantic, high-level information.

Bio: Vincent is an Assistant Professor at MIT EECS, where he leads the Scene Representation Group. Previously, he finished his Ph.D. at Stanford University. He is interested in the self-supervised training of 3D-aware vision models: his goal is to train models that, given a single image or short video, can reconstruct a representation of the underlying scene that encodes information about materials, affordances, geometry, lighting, etc., a task that is simple for humans but currently impossible for AI.


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.
This talk will be recorded and live streamed via Zoom.  Register for webinar here.

Lattice-Based Cryptography and the Learning with Errors Problem

Date and Time
Tuesday, October 10, 2023 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ran Raz

Oded Regev
Most cryptographic protocols in use today are based on number theoretic problems such as integer factoring. I will give an introduction to lattice-based cryptography, a form of cryptography offering many advantages over the traditional number-theoretic-based ones, including conjectured security against quantum computers. The talk will mainly focus on the so-called Learning with Errors (LWE) problem. This problem has turned out to be an amazingly versatile basis for lattice-based cryptographic constructions, with hundreds of applications. I will also mention work on making cryptographic constructions highly efficient using algebraic number theory (leading to a NIST standard and implementation in browsers such as Chrome), as well as some recent applications to machine learning.
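The Learning with Errors problem asks one to recover a secret s from noisy random linear equations b = A s + e (mod q); without the small error e, Gaussian elimination would solve it instantly. A toy instance (these parameters are far too small for any real security):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy LWE parameters: modulus q, secret dimension n, number of samples m.
q, n, m = 97, 8, 16
s = rng.integers(0, q, size=n)           # secret vector
A = rng.integers(0, q, size=(m, n))      # public uniformly random matrix
e = rng.integers(-2, 3, size=m)          # small error terms
b = (A @ s + e) % q                      # published LWE samples: (A, b)

# The residuals b - A s (mod q) are exactly the small errors; recovering
# s from (A, b) alone, however, is conjectured to be hard at scale.
residual = (b - A @ s) % q
```

Cryptographic constructions hide messages inside these noisy equations, and the conjectured hardness holds even against quantum algorithms.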

The talk will be accessible to a wide audience.

Bio: Oded Regev is a Silver Professor in the Courant Institute of Mathematical Sciences of New York University. He received his Ph.D. in computer science from Tel Aviv University in 2001 under the supervision of Yossi Azar, continuing to a postdoctoral fellowship at the Institute for Advanced Study. He is a recipient of the 2019 Simons Investigator award, the 2018 Gödel Prize, several best paper awards, and was a speaker at the 2022 International Congress of Mathematicians. His main research areas include theoretical computer science, RNA biology, quantum computation, and machine learning.


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.

This talk will be recorded and live streamed via Zoom.  Please register for Zoom webinar here.
