CS Department Colloquium Series

Distributionally Robust Machine Learning

Date and Time
Tuesday, March 26, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ellen Zhong

Shiori Sagawa
Machine learning models are widely deployed today, but they can fail due to distribution shifts: mismatches in the data distribution between training and deployment. Models can fail on certain subpopulations (e.g., language models can fail on non-English languages) and on new domains unseen during training (e.g., medical models can fail on new hospitals). In this talk, I will discuss my work on algorithms for improving robustness to distribution shifts. First, to mitigate subpopulation shifts, I develop methods that leverage distributionally robust optimization (DRO). My methods overcome the computational and statistical obstacles of applying DRO on modern neural networks and on real-world shifts. Second, to tackle domain shifts, I build WILDS, a benchmark of real-world shifts, and show that existing methods fail on WILDS even though they perform well on synthetic shifts from prior benchmarks. I then develop a state-of-the-art method that successfully mitigates real-world domain shifts; my method proposes an alternative to domain invariance—a key principle behind the prior methods—to reflect the structure of real-world shifts. Altogether, my algorithms improve robustness to a wide range of distribution shifts in the wild, from subpopulation shifts in language modeling to domain shifts in wildlife monitoring and histopathology.
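To make the subpopulation-shift problem concrete, the group DRO objective that Sagawa's methods build on replaces the average training loss with the worst-group loss. The sketch below uses made-up per-example losses and group labels purely for illustration:

```python
# Illustrative sketch of the group DRO objective: instead of minimizing the
# average loss over all examples, minimize the loss of the worst-off group.
# Losses and group labels here are toy values, not from any real dataset.

def average_loss(losses):
    return sum(losses) / len(losses)

def group_dro_loss(losses, groups):
    """Worst-group average loss over examples annotated with group ids."""
    by_group = {}
    for loss, g in zip(losses, groups):
        by_group.setdefault(g, []).append(loss)
    return max(average_loss(ls) for ls in by_group.values())

losses = [0.1, 0.2, 0.9, 1.1]                              # per-example losses
groups = ["majority", "majority", "minority", "minority"]  # subpopulations

print(average_loss(losses))            # ~0.575: looks fine on average
print(group_dro_loss(losses, groups))  # 1.0: reveals the failing subpopulation
```

A model can look acceptable on average while failing badly on a minority group; optimizing the worst-group loss surfaces exactly that failure mode.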

Bio: Shiori Sagawa is a final-year PhD Candidate in Computer Science at Stanford University, advised by Percy Liang. Her research focuses on algorithms for reliable machine learning. She was awarded the Stanford Graduate Fellowship and an Apple Scholars in AI/ML PhD Fellowship. Prior to her PhD, she received her B.A. in Computer Science and Molecular and Cell Biology from UC Berkeley, and she worked at D. E. Shaw Research.


To request accommodations for a disability please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Stochastic Computer Graphics

Date and Time
Monday, March 25, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Adam Finkelstein

Silvia Sellan
Computer Graphics research has long been dominated by the interests of large film, television and social media companies, forcing other, more safety-critical applications (e.g., medicine, engineering, security) to repurpose Graphics algorithms originally designed for entertainment. In this talk, I will advocate for a perspective shift in our field that allows us to design algorithms directly for these safety-critical application realms. I will show that this begins by reinterpreting traditional Graphics tasks (e.g., 3D modeling and reconstruction) from a statistical lens and quantifying the uncertainty in our algorithmic outputs, as exemplified by the research I have conducted for the past five years. I will end by mentioning several ongoing and future research directions that carry this statistical lens to entirely new problems in Graphics and Vision and into specific applications.
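One generic way to put a statistical lens on a reconstruction task, sketched here with toy numbers and not drawn from the speaker's actual methods, is to refit the reconstruction on bootstrap resamples of the noisy measurements and report a mean and an uncertainty instead of a single answer:

```python
# Toy illustration of quantifying uncertainty in a reconstruction: fit many
# reconstructions to bootstrap resamples of noisy measurements and report a
# per-query mean and spread. The "reconstruction" here is just a 1D mean
# estimate, standing in for a surface fit; all values are illustrative.
import random
import statistics

def reconstruct(samples):
    return statistics.fmean(samples)

def reconstruct_with_uncertainty(samples, n_bootstrap=200, seed=0):
    rng = random.Random(seed)
    estimates = [
        reconstruct([rng.choice(samples) for _ in samples])
        for _ in range(n_bootstrap)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)

noisy = [1.0, 1.1, 0.9, 1.05, 0.95]   # noisy measurements of one quantity
mean, std = reconstruct_with_uncertainty(noisy)
print(f"estimate {mean:.2f} with spread {std:.2f}")
```

For safety-critical uses, the spread is the point: a downstream consumer can reject or flag outputs whose uncertainty is too large rather than trusting a single deterministic reconstruction.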

Bio: Silvia is a fifth-year Computer Science PhD student at the University of Toronto, working in Computer Graphics and Geometry Processing. She is a Vanier Doctoral Scholar, an Adobe Research Fellow and the winner of the 2021 University of Toronto Arts & Science Dean’s Doctoral Excellence Scholarship. She has interned twice at Adobe Research and twice at the Fields Institute of Mathematics. She is also a founder and organizer of the Toronto Geometry Colloquium and a member of WiGRAPH.



Generalizing Beyond the Training Distribution through Compositional Generation

Date and Time
Thursday, April 4, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Felix Heide

Yilun Du
Generative AI has led to stunning successes in recent years but is fundamentally constrained by the amount of data available. This is especially limiting in the embodied setting – where an agent must solve new tasks in new environments. In this talk, I’ll introduce the idea of compositional generative modeling, which enables generalization beyond the training data by building complex generative models from smaller constituents. I’ll first introduce the idea of energy-based models and illustrate how they enable compositional generative modeling. I’ll then illustrate how such compositional models enable us to synthesize complex plans for unseen tasks at inference time. Finally, I'll show how such compositionality can be applied to multiple foundation models trained on various forms of Internet data, enabling us to construct decision-making systems that can hierarchically plan and solve long-horizon problems in a zero-shot manner.
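The core compositional trick with energy-based models is that each constraint is an energy function and a composed model simply sums energies, so low-energy samples satisfy all constraints at once. The toy sketch below "samples" by brute force over a 1D grid; real EBMs use gradient-based samplers, and the energy functions here are invented for illustration:

```python
# Sketch of compositional generation with energy-based models (EBMs): each
# constraint is an energy function, composition is summation of energies,
# and sampling here is brute-force minimization over a 1D grid of candidates.

def near(target):
    """Energy that is low near `target` (a toy quadratic constraint)."""
    return lambda x: (x - target) ** 2

def compose(*energies):
    """Composed model: sum of constituent energies (a product of experts)."""
    return lambda x: sum(e(x) for e in energies)

grid = [i / 100 for i in range(-300, 301)]  # candidate "samples"

def sample(energy):
    return min(grid, key=energy)

e1, e2 = near(0.0), near(2.0)
print(sample(e1))               # -> 0.0, the minimum of the first constraint
print(sample(compose(e1, e2)))  # -> 1.0, a compromise satisfying both
```

Summing energies corresponds to multiplying (unnormalized) probability densities, which is why composed EBMs can express conjunctions of constraints never seen together during training.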

Bio: Yilun Du is a final-year PhD student at MIT CSAIL advised by Leslie Kaelbling, Tomas Lozano-Perez and Joshua Tenenbaum. His research spans the fields of machine learning and robotics, with a focus on generative models. He is supported by the NSF Graduate Research Fellowship and was previously a research fellow at OpenAI, a visiting researcher at FAIR and a student researcher at Google DeepMind.



Scalable and Efficient Systems for Large Language Models

Date and Time
Tuesday, March 5, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ravi Netravali

Lianmin Zheng
Large Language Models (LLMs) have been driving recent breakthroughs in AI. These advancements would not have been possible without the support of scalable and efficient infrastructure systems. In this talk, I will introduce several underlying systems I have designed and built to support the entire model lifecycle, from training to deployment to evaluation. First, I will present Alpa, a system for large-scale model-parallel training that automatically generates execution plans unifying data, operator, and pipeline parallelism. Next, I will discuss efficient deployment systems, covering the frontend programming interface and backend runtime optimizations for high-performance inference. Finally, I will complete the model lifecycle by presenting our model evaluation efforts, including the crowdsourced live benchmark platform, Chatbot Arena, and the automatic evaluation pipeline, LLM-as-a-Judge. These projects have collectively laid a solid foundation for large language model systems and have been widely adopted by leading LLM developers and companies. I will conclude by outlining some future directions of machine learning systems, such as co-optimizing across the full stack for building AI-centric applications.
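Chatbot Arena turns crowdsourced pairwise model battles into a leaderboard. A minimal sketch of an Elo-style rating update of the kind used for such pairwise comparisons is below; the K-factor and initial ratings are illustrative choices, not the platform's actual configuration:

```python
# Elo-style rating update for pairwise battles: the winner gains rating in
# proportion to how surprising the win was. K and starting ratings are toy.

def expected_score(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, a_won, k=32):
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))

ratings = {"model_a": 1000.0, "model_b": 1000.0}
ratings["model_a"], ratings["model_b"] = update(
    ratings["model_a"], ratings["model_b"], a_won=True
)
print(ratings)  # for an even matchup, A rises by 16 and B falls by 16
```

Because updates are driven by surprise, an upset win over a highly rated model moves the ratings far more than a win over a peer, which is what lets a live benchmark converge with relatively few votes per pair.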

Bio: Lianmin Zheng is a Ph.D. student in the EECS department at UC Berkeley, advised by Ion Stoica and Joseph E. Gonzalez. His research interests include machine learning systems, large language models, compilers, and distributed systems. He builds full-stack, scalable, and efficient systems to advance the development of AI. He co-founded LMSYS.org, where he leads impactful open-source large language model projects such as Vicuna and Chatbot Arena, which have received millions of downloads and served millions of users. He also co-organized the Big Model Tutorial at ICML 2022. He has received a Meta Ph.D. Fellowship, an IEEE Micro Best Paper Award, and an a16z open-source AI grant.



Catch M(oor)e If You Can: Agile Hardware/Software Co-Design for Hyperscale Cloud Systems

Date and Time
Wednesday, April 3, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Margaret Martonosi

Global reliance on cloud services, powered by transformative technologies like generative AI, machine learning, and big-data analytics, is driving exponential growth in demand for hyperscale cloud compute infrastructure. Meanwhile, the breakdown of classical hardware scaling (e.g., Moore's Law) is hampering growth in compute supply. Building domain-specific hardware can address this supply-demand gap, but catching up with exponential demand requires developing new hardware rapidly and with confidence that performance/efficiency gains will compound in the context of a complete system. These are challenging tasks given the status quo in hardware design, even before accounting for the immense scale of cloud systems.

Sagar Karandikar
This talk will focus on two themes of my work: (1) Developing radical new agile, end-to-end hardware/software co-design tools that challenge the status quo in hardware design for systems of all scales and unlock the ability to innovate on new hardware at datacenter scale. (2) Leveraging these tools and insights from hyperscale datacenter fleet profiling to architect and implement state-of-the-art domain-specific hardware that addresses key efficiency challenges in hyperscale cloud systems.

I will first cover my work creating the award-winning and widely used FireSim FPGA-accelerated hardware simulation platform, which provides unprecedented hardware/software co-design capabilities. FireSim automatically constructs high-performance, cycle-exact, scale-out simulations of novel hardware designs derived from the tapeout-friendly RTL code that describes them, empowering hardware designers and domain experts alike to directly iterate on new hardware designs in hours rather than years. FireSim also unlocks innovation in datacenter hardware with the unparalleled ability to scale to massive, distributed simulations of thousand-node networked datacenter clusters with specialized server designs and complete control over the datacenter architecture. I will then briefly cover my work co-creating the also widely used Chipyard platform for agile construction, simulation (including FireSim), and tape-out of specialized RISC-V System-on-Chip (SoC) designs using a novel, RTL-generator-driven approach.

Next, I will discuss my work in collaboration with Google on Hyperscale SoC, a cloud-optimized server chip built, evaluated, and taped-out with FireSim and Chipyard. Hyperscale SoC includes my work on several novel domain-specific accelerators (DSAs) for expensive but foundational operations in hyperscale servers, including (de)serialization, (de)compression, and more. Hyperscale SoC demonstrates a new paradigm of data-driven, end-to-end hardware/software co-design, combining key insights from profiling Google's world-wide datacenter fleet with the ability to rapidly build and evaluate novel hardware designs in FireSim/Chipyard. This instance of Hyperscale SoC is just the beginning; I will conclude by covering the wide-ranging opportunities that can now be explored for radically redesigning next generation hyperscale cloud datacenters.

Bio: Sagar Karandikar is a Ph.D. Candidate at UC Berkeley and a Student Researcher at Google. His work broadly focuses on co-designing hardware and software to build next generation hyperscale cloud systems. He is also interested in agile, open-source hardware development methodologies.

His first-author publications have received several honors, including being selected for the ISCA@50 25-year Retrospective, as an IEEE Micro Top Pick, as an IEEE Micro Top Pick Honorable Mention, and as the MICRO '21 Distinguished Artifact Award winner.

He created and leads the FireSim project, which has been used as a foundational research platform in over 50 peer-reviewed publications from first authors at over 20 institutions. FireSim has also been used in the development of commercially available chips and as a standard host platform for DARPA and IARPA programs. He is a co-creator and co-lead of the also widely used Chipyard RISC-V System-on-Chip (SoC) development platform. His work on Hyperscale SoC has been influential at Google and more broadly across other silicon vendors. He was selected as a 2022 DARPA Riser and received the UC Berkeley Outstanding Graduate Student Instructor (TA) Award. He received his M.S. and B.S. from UC Berkeley.



Mathematical Foundations for Physical Agents

Date and Time
Thursday, March 21, 2024 - 4:30pm to 5:30pm
Location
Engineering Quadrangle B205
Type
CS Department Colloquium Series
Speaker
Max Simchowitz, from Massachusetts Institute of Technology
Host
Elad Hazan, Chi Jin

Max Simchowitz
From robotics to autonomous vehicles, machine learning agents deployed in the physical world (“physical agents”) promise to revolutionize endeavors ranging from manufacturing to agriculture to domestic labor. In this talk, we will develop mathematical foundations, from the ground up, for how to carry out this vision. We will begin our investigation by examining linear dynamical systems, a simple and fundamental model of the interaction between a physical agent and its environment. We prove mathematically that simple exploration attains optimal performance for both some of the simplest and some of the most complex learning problems in this class. The above finding, while powerful, strongly motivates moving past linear dynamics as a mathematical testbed for understanding learning with physical agents.
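To make the linear-dynamical-systems setting concrete, here is a toy sketch, not taken from the talk, of identifying a scalar system x_{t+1} = a·x_t + b·u_t + noise from a trajectory driven by simple random exploratory inputs, using ordinary least squares; all parameter values are illustrative:

```python
# System identification for a scalar linear dynamical system
# x_{t+1} = a*x_t + b*u_t + noise, with random exploratory inputs and an
# ordinary least-squares estimate of (a, b). All values here are toy.
import random

def simulate(a, b, steps, noise=0.01, seed=0):
    rng = random.Random(seed)
    x, data = 0.0, []
    for _ in range(steps):
        u = rng.uniform(-1, 1)                        # random exploration
        x_next = a * x + b * u + rng.gauss(0, noise)  # system dynamics
        data.append((x, u, x_next))
        x = x_next
    return data

def least_squares(data):
    # Solve the 2x2 normal equations for the regressors (x, u).
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    return (suu * sxy - sxu * suy) / det, (sxx * suy - sxu * sxy) / det

a_hat, b_hat = least_squares(simulate(a=0.9, b=0.5, steps=500))
print(a_hat, b_hat)  # close to the true (0.9, 0.5)
```

Even this scalar case shows why the linear setting is a clean testbed: random inputs excite the dynamics enough that a closed-form estimator recovers the system, a property that no longer holds once the dynamics are nonlinear.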

Hence, we turn to providing mathematical guarantees for a setting of real-world importance that does not fit the linear mold: behavior cloning. Behavior cloning — teaching a robot to imitate from example demonstrations — lies at the heart of many of today’s most promising robot learning endeavors due to its intuitive data collection and simplicity. Though it can work incredibly well, we still do not have a clear understanding of what circumstances ensure its success. Bringing together the flexibility of generative models with key intuitions arising from the study of linear control, we introduce a framework for behavior cloning that enables an agent to imitate nearly arbitrary behavior with provable guarantees, even when the dynamics governing the agent’s interaction with its environment are nonlinear. We conclude by outlining ongoing work and future steps towards building out the mathematical and conceptual tooling for understanding the next steps towards general, capable and flexible physical agents.

Bio: Max Simchowitz is a postdoctoral researcher in the Robot Locomotion Group at MIT CSAIL. He studies the theoretical foundations of machine learning problems with a sequential or dynamical component; he currently focuses on robotics and out-of-distribution learning, with past work ranging broadly across control, reinforcement learning, optimization and algorithmic fairness. He received his PhD from University of California, Berkeley in 2021 under Ben Recht and Michael I. Jordan, and his work has been recognized with an ICML 2018 Best Paper Award, ICML 2022 Outstanding Paper Award, and RSS 2023 Best Paper Finalist designation.


This talk is co-sponsored by the departments of Electrical and Computer Engineering and Computer Science.

To request accommodations for a disability please contact Lidia Stokman, lstokman@princeton.edu, at least one week prior to the event.

Making Language Models Useful

Date and Time
Thursday, February 29, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Karthik Narasimhan

Eric Mitchell
Large pre-trained language models, most notably GPT-3, are the engines of knowledge and capability underpinning powerful systems such as ChatGPT, Gemini, and Claude. Yet much like building a safe, comfortable vehicle requires more than a powerful engine, building a useful, beneficial language system requires additional techniques to promote key attributes such as controllability, factuality, and updatability. This talk will share my work towards imbuing large language models with these traits. I will first share the direct preference optimization algorithm, a more scalable algorithm for training language models to follow instructions in accordance with human preferences. I will next discuss approaches for improving the factual reliability of language models, which is challenging even for models that generally follow user instructions well. Finally, I will share my work towards methods for updating individual model behaviors or beliefs that have fallen out of date or are otherwise problematic. I will conclude with several important topics for future work toward more useful, trustworthy AI systems, including unsupervised continual learning, scalable oversight, and robust reasoning.
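The core of direct preference optimization (DPO) is a simple classification-style loss on the log-probabilities of a preferred versus a dispreferred response, measured relative to a frozen reference model. The sketch below shows the loss itself; the log-probability values are made up for illustration:

```python
# DPO loss on a single preference pair: push the policy to prefer the chosen
# response over the rejected one, relative to a frozen reference model.
# The log-probabilities below are invented numbers for illustration.
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))  # -log(sigmoid(margin))

# If the policy already prefers the chosen response more than the reference
# does, the loss falls below log(2); otherwise it rises above it.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # < log(2), small gradient
print(dpo_loss(-12.0, -10.0, -11.0, -11.0))  # > log(2), large gradient
```

Because the loss depends only on log-probabilities, DPO trains directly on preference data with standard supervised infrastructure, avoiding the separate reward model and reinforcement-learning loop of classic RLHF.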

Bio: Eric Mitchell is a final-year PhD student in Stanford’s Computer Science department, advised by Chelsea Finn and Christopher Manning. His research uses tools from machine learning to improve the usefulness and reliability of language models, in particular by developing techniques that enhance their controllability, factuality, and updatability. His work has appeared in ICML, NeurIPS, ICLR, and EMNLP, and was recognized with an outstanding paper runner-up award at NeurIPS ’23. His work, in particular the direct preference optimization algorithm, has been used widely in state-of-the-art open source and proprietary language models. He is a former Knight-Hennessy Scholar and received his BS from Princeton University.



Secure systems from insecure components

Date and Time
Thursday, March 7, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Amit Levy

Emma Dauterman
In many computer systems today, an attacker that breaks one system component can steal data from millions of users. In this talk, I will present two systems that can withstand component compromise. I will describe (1) a single sign-on system that protects user security and privacy from a compromised single sign-on server, and (2) a secure-hardware-based backup service that protects user backups from compromised secure hardware devices. These systems provide strong security and privacy properties while taking into account practical constraints such as compatibility requirements, hardware limitations, and user expectations. Each splits user secrets across different system components, using new cryptographic tools to provide necessary functionality while protecting user data.
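A toy version of the "split user secrets across components" idea is two-party XOR secret sharing: either share alone is a uniformly random string, so an attacker must compromise both components to learn anything. Real systems like those in the talk use richer cryptographic tools (e.g. threshold schemes and hardware protections); this sketch only illustrates the principle:

```python
# Two-party XOR secret sharing: split a secret into two random-looking
# shares held by different system components; one share reveals nothing.
import secrets

def split(secret: bytes):
    share1 = secrets.token_bytes(len(secret))            # uniformly random
    share2 = bytes(a ^ b for a, b in zip(secret, share1))
    return share1, share2

def reconstruct(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

s1, s2 = split(b"user backup key")
assert reconstruct(s1, s2) == b"user backup key"
print("a single share is indistinguishable from random:", s1.hex())
```

The design consequence is the one the abstract emphasizes: compromising any single component, whether a server or a hardware device, no longer yields user data.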

Bio: Emma Dauterman is a Ph.D. candidate at UC Berkeley where she is advised by Raluca Ada Popa and Ion Stoica. Her research interests include computer security, systems, and applied cryptography. She has received the Microsoft Research Ada Lovelace Fellowship, the NSF Graduate Research Fellowship, and a UC Berkeley EECS excellence award.



Scaling Deep Learning Up and Down

Date and Time
Thursday, March 21, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Jia Deng

Zhuang Liu
Deep learning with neural networks has emerged as a key approach for discovering patterns and modeling relationships in complex data. AI systems powered by deep learning are used widely in applications across a broad spectrum of scales. There is strong demand for scaling deep learning both upward and downward. Scaling up highlights the pursuit of scalability - the ability to utilize increasingly abundant computing and data resources to achieve superior capabilities, overcoming diminishing returns. Scaling down represents the demand for efficiency - there is limited data for many application domains, and deployment is often in compute-limited settings. My research focuses on scaling deep learning both up and down, to build capable models and understand their behaviors in different computational and data environments.

In this talk, we present studies in both directions. For scaling up, we first explore the design of scalable neural network architectures that are widely adopted in various fields. We then discuss an intriguing observation on modern vision datasets and its implication on scaling training data. For scaling down, we introduce simple, effective, and widely used approaches for compressing convolutional networks and large language models, alongside interesting empirical findings. Notably, a recurring theme in this talk is the careful examination of implicit assumptions in the literature, which often leads to surprising revelations that reshape community understanding. Finally, we discuss exciting avenues for future deep learning and vision research, such as developing next-gen architectures and modeling datasets.
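A standard building block for the "scaling down" direction is magnitude pruning: zero out the smallest-magnitude weights and keep the rest. The sketch below operates on a toy list of weights rather than real layer tensors, and is a generic illustration rather than the speaker's specific compression methods:

```python
# Magnitude pruning: zero the fraction `sparsity` of weights with the
# smallest absolute value. Toy weight list; in practice this is applied
# per-layer to weight tensors.

def magnitude_prune(weights, sparsity):
    """Return a copy with the `sparsity` fraction of smallest |w| zeroed."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.05, -0.8, 0.01, 0.3, -0.02, 0.6]
print(magnitude_prune(w, 0.5))  # -> [0.0, -0.8, 0.0, 0.3, 0.0, 0.6]
```

The empirical puzzle this raises, and one reason pruning literature rewards the careful re-examination the talk describes, is why networks often tolerate high sparsity with little accuracy loss.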

Bio: Zhuang Liu is currently a Research Scientist at Meta AI Research (FAIR) in New York City. He received his Ph.D. from UC Berkeley EECS in 2022, advised by Trevor Darrell. His research areas include deep learning and computer vision. His work focuses on scaling neural networks both up and down, to build capable models and understand their behaviors in different computational and data environments. His work is broadly applied in different areas of computing and other disciplines. He is a recipient of the CVPR 2017 Best Paper Award.



Rethinking Data Use in Large Language Models

Date and Time
Monday, March 4, 2024 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Danqi Chen

Sewon Min
Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data. Next, I will introduce a new class of LMs—nonparametric LMs—that repurpose this training data as a data store from which they retrieve information for improved accuracy and updatability. I will describe my work on establishing the foundations of such models, including one of the first broadly used neural retrieval models and an approach that simplifies a traditional, two-stage pipeline into one. I will also discuss how nonparametric models open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, improved factuality, and decentralization.
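A minimal sketch of the nonparametric-LM idea: at inference time, retrieve the nearest entries from a data store of (context, continuation) pairs and let them inform the prediction. The embeddings below are bag-of-words counts with cosine similarity, and the data store entries are invented; real systems use learned dense retrievers over huge corpora:

```python
# Toy retrieval over a data store of (context, continuation) pairs, using
# bag-of-words "embeddings" and cosine similarity. All entries are invented.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)   # Counter returns 0 for missing keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

datastore = [
    ("the capital of france is", "paris"),
    ("the capital of japan is", "tokyo"),
]

def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(datastore, key=lambda e: cosine(q, embed(e[0])),
                    reverse=True)
    return ranked[:k]

print(retrieve("what is the capital of japan"))
```

Because the knowledge lives in the data store rather than the weights, entries can be added, removed, or segregated by license after training, which is what enables the updatability and responsible-data-use properties the abstract describes.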

Bio: Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023, received a J.P. Morgan Fellowship, and was named an EECS rising star in 2022.


