Manifold learning uncovers hidden structure in complex cellular state space

Date and Time
Friday, April 5, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Mona Singh

David van Dijk
In the era of big biological data, there is a pressing need for methods that visualize, integrate, and interpret high-throughput, high-dimensional data to enable biological discovery. There are several major challenges in analyzing high-throughput biological data. These include the curse of (high) dimensionality, noise, sparsity, missing values, bias, and collection artifacts. In my work, I try to solve these problems using computational methods based on manifold learning. A manifold is a smoothly varying low-dimensional structure embedded within a high-dimensional ambient measurement space. In my talk, I will present a number of my recently completed and ongoing projects that utilize the manifold, implemented using graph signal processing and deep learning, to understand large biomedical datasets. These include MAGIC, a data denoising and imputation method designed to ‘fix’ single-cell RNA-sequencing data; PHATE, a dimensionality reduction and visualization method specifically designed to reveal continuous progression structure; and two deep learning methods that use specially designed constraints to allow for deep interpretable representations of heterogeneous systems. I will demonstrate that these methods can give insight into diverse biological systems such as breast cancer epithelial-to-mesenchymal transition, human embryonic stem cell development, the gut microbiome, and tumor-infiltrating lymphocytes.
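As a concrete illustration of the graph-diffusion idea behind denoising methods in this family, the sketch below builds a kernel graph over noisy points, converts it into a Markov matrix, and diffuses the measurements along it. This is a minimal, assumed sketch of the general technique (the neighborhood size k, diffusion time t, and kernel choice are illustrative placeholders), not the MAGIC implementation itself.

import numpy as np
from scipy.spatial.distance import cdist

def diffuse_denoise(X, k=10, t=3):
    """Smooth noisy measurements X (rows = observations) along a data-driven graph."""
    D = cdist(X, X)                            # pairwise distances
    sigma = np.sort(D, axis=1)[:, k]           # adaptive bandwidth: k-th neighbor distance
    A = np.exp(-(D / sigma[:, None]) ** 2)     # Gaussian affinities
    A = (A + A.T) / 2                          # symmetrize
    M = A / A.sum(axis=1, keepdims=True)       # row-stochastic Markov matrix
    return np.linalg.matrix_power(M, t) @ X    # t-step diffusion of the signal

# Toy check: a noisy one-dimensional manifold (a circle) embedded in 50 dimensions.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 300)
clean = np.stack([np.cos(theta), np.sin(theta)], axis=1) @ rng.normal(size=(2, 50))
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
print("error before diffusion:", np.linalg.norm(noisy - clean))
print("error after diffusion: ", np.linalg.norm(diffuse_denoise(noisy) - clean))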

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Sara B. Thibeault at thibeault@princeton.edu, at least one week prior to the event.

Democratizing Web Automation: Programming for Social Scientists and Other Domain Experts

Date and Time
Thursday, April 11, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Andrew Appel

Sarah Chasins
We have promised social scientists a data revolution, but it hasn’t arrived.  What stands between practitioners and the data-driven insights they want? Acquiring the data.  In particular, acquiring the social media, online forum, and other web data that was supposed to help them produce big, rich, ecologically valid datasets.  Web automation programming is resistant to high-level abstractions, so end-user programmers end up stymied by the need to reverse engineer website internals—DOM, JavaScript, AJAX.  Programming by Demonstration (PBD) offered one promising avenue towards democratizing web automation.  Unfortunately, as the web matured, the programs became too complex for PBD tools to synthesize, and web PBD progress stalled.
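To make the barrier concrete: even a seemingly simple task like collecting every listing's title and price from a results page bottoms out in site-specific DOM details, as in the sketch below. The URL, page structure, and CSS selectors here are hypothetical placeholders; on a real site they must be reverse-engineered page by page, they break whenever the markup changes, and pages that render content with JavaScript or AJAX need a different approach entirely.

import requests
from bs4 import BeautifulSoup

def scrape_listings(url):
    # Fetch the page and parse its DOM.
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Fragile, markup-specific selectors that a non-programmer would have to discover.
    rows = soup.select("div.results > div.listing")
    return [{"title": row.select_one("h3.title").get_text(strip=True),
             "price": row.select_one("span.price").get_text(strip=True)}
            for row in rows]

# Hypothetical usage:
# print(scrape_listings("https://example.com/search?q=apartments"))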

In this talk, I’ll describe how I reformulated traditional web PBD around the insight that demonstrations are not always the easiest way for non-programmers to communicate their intent. By shifting from a purely Programming-By-Demonstration view to a Programming-By-X view that accepts a variety of user-friendly inputs, we can dramatically broaden the class of programs that come within reach of end-user programmers. My Helena ecosystem combines (i) usable PBD-based program drafting tools, (ii) learnable programming languages, and (iii) novel programming environment interactions.  The end result: non-coders write Helena programs in 10 minutes that can handle the complexity of modern webpages, while coders attempting the same task time out after an hour. I’ll conclude with predictions about the abstraction-resistant domains that will fall next—robotics, analysis of unstructured texts, image processing—and how hybrid PL-HCI breakthroughs will vastly expand access to programming.

Bio:
Sarah Chasins is a Ph.D. candidate at UC Berkeley, advised by Ras Bodik.  Her research interests lie at the intersection of programming languages and human-computer interaction.  Much of her work is shaped by ongoing collaborations with social scientists, data scientists, and other non-traditional programmers.  She has been awarded an NSF graduate research fellowship and a first place award in the ACM Student Research Competition. 

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

AlphaGo and the Computational Challenges of Machine Learning

Date and Time
Tuesday, April 9, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Ryan Adams

Chris Maddison
Many computational challenges in machine learning involve the three problems of optimization, integration, and fixed-point computation. These three can often be reduced to each other, so they may also provide distinct vantages on a single problem. In this talk, I present a small part of this picture through a discussion of my work on AlphaGo and two vignettes on my work on the interplay between optimization and Monte Carlo. AlphaGo is the first computer program to defeat a world-champion player, Lee Sedol, in the board game of Go. My work laid the groundwork for the neural network components of AlphaGo and culminated in our Nature publication describing AlphaGo's algorithm, at whose core lie these three problems. In the first vignette, I present the Hamiltonian descent methods we introduced for first-order optimization. These methods are inspired by the Monte Carlo literature and can achieve fast linear convergence without strong convexity by using a non-standard kinetic energy to condition the optimization. In the second vignette, I cover our A* Sampling method, which reduces the problem of Monte Carlo simulation to an optimization problem, and an application to gradient estimation in stochastic computation graphs.
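One way to make the "Monte Carlo simulation as optimization" reduction concrete is the classic Gumbel-max trick, the finite-dimensional special case that A* Sampling generalizes to continuous spaces. The sketch below is illustrative only; it is not the A* Sampling algorithm, which couples Gumbel processes with a branch-and-bound search.

import numpy as np

rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.2, 0.3, 0.4]))   # unnormalized log-probabilities

def gumbel_max_sample(logits, rng):
    # Perturb each log-probability with independent Gumbel(0, 1) noise and take the
    # argmax; the maximizer is an exact sample from the categorical distribution.
    gumbels = rng.gumbel(size=logits.shape)
    return int(np.argmax(logits + gumbels))

# Empirical check: argmax frequencies match the target probabilities.
samples = [gumbel_max_sample(logits, rng) for _ in range(100_000)]
print(np.bincount(samples, minlength=4) / len(samples))   # approximately [0.1, 0.2, 0.3, 0.4]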

Bio: 
Chris Maddison is a PhD candidate in the Statistical Machine Learning Group in the Department of Statistics at the University of Oxford. He is an Open Philanthropy AI Fellow and spends two days a week as a Research Scientist at DeepMind. His research is broadly focused on the development of numerical methods for deep learning and machine learning. He has worked on methods for variational inference, numerical optimization, and Monte Carlo estimation with a specific focus on those that might work at scale with few assumptions. Chris received his MSc. from the University of Toronto. He received a NeurIPS Best Paper Award in 2014, and was one of the founding members of the AlphaGo project.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

Learning to See the Physical World

Date and Time
Monday, April 1, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Tom Griffiths

Jiajun Wu
Human intelligence goes beyond pattern recognition. From a single image, we're able to explain what we see, reconstruct the scene in 3D, predict what's going to happen, and plan our actions accordingly. In this talk, I will present our recent work on physical scene understanding---building versatile, data-efficient, and generalizable machines that learn to see, reason about, and interact with the physical world. The core idea is to exploit the generic, causal structure behind the world, including knowledge from computer graphics, physics, and language, in the form of approximate simulation engines, and to integrate them with deep learning. Here, deep learning plays two major roles: first, it learns to invert simulation engines for efficient inference; second, it learns to augment simulation engines for constructing powerful forward models. I'll focus on a few topics to demonstrate this idea: building scene representations that capture both object geometry and physics; learning expressive dynamics models for planning and control; and perception and reasoning beyond vision.
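As a toy illustration of what it means to invert a forward model, the sketch below recovers the launch parameters of a two-line projectile simulator by searching for the inputs whose rollout best matches a noisy observation. The simulator, parameter grid, and brute-force search are stand-ins for the learned graphics and physics engines and neural inference networks described above.

import numpy as np

G = 9.8  # gravity, m/s^2

def simulate(speed, angle, timesteps=20, dt=0.1):
    # Forward model: projectile positions given launch speed and angle.
    t = np.arange(timesteps) * dt
    return np.stack([speed * np.cos(angle) * t,
                     speed * np.sin(angle) * t - 0.5 * G * t ** 2], axis=1)

# An "observed" trajectory with measurement noise; the true parameters are unknown.
rng = np.random.default_rng(0)
observed = simulate(12.0, 0.6) + rng.normal(scale=0.05, size=(20, 2))

# Invert the simulator by minimizing reconstruction error over a parameter grid.
speeds = np.linspace(5, 20, 151)
angles = np.linspace(0.1, 1.4, 131)
errors = np.array([[np.sum((simulate(s, a) - observed) ** 2) for a in angles]
                   for s in speeds])
i, j = np.unravel_index(errors.argmin(), errors.shape)
print("recovered speed and angle:", speeds[i], angles[j])   # close to 12.0 and 0.6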

Bio: 
Jiajun Wu is a Ph.D. student in Electrical Engineering and Computer Science at Massachusetts Institute of Technology. He received his B.Eng. from Tsinghua University in 2014. His research interests lie at the intersection of computer vision, machine learning, robotics, and computational cognitive science. His research has been recognized with the IROS Best Paper Award on Cognitive Robotics and fellowships from Facebook, Nvidia, Samsung, Baidu, and Adobe, and his work has been covered by major media outlets including CNN, BBC, WIRED, and MIT Tech Review.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

Protecting Privacy by Splitting Trust

Date and Time
Thursday, April 4, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Arvind Narayanan

Henry Corrigan-Gibbs
When the maker of my phone, smart-watch, or web browser collects data about how I use it, must I trust the manufacturer to protect that sensitive information from theft? When I use the cryptographic hardware module in my laptop, need I trust that it will keep my secrets safe? When I use a messaging app to chat with friends, must I trust the app vendor not to sell the details of my messaging activity for profit?

This talk will show that we can get the functionality we want from our systems without having to put blind faith in the correct behavior of the companies collecting our data, building our hardware, or designing our apps. The principle is to split our trust -- among organizations, or devices, or users. I will introduce new cryptographic techniques and systems-level optimizations that make it practical to split trust in a variety of settings. Then, I will present three built systems that employ these ideas, including one that now ships with the Firefox browser.
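The talk does not spell out its constructions here, but one standard building block for splitting trust across servers is additive secret sharing, sketched below: each client splits its value into random shares held by different servers, so no single server learns anything, yet the servers can jointly compute an aggregate. This is an illustrative sketch of the general idea rather than any specific system from the talk; deployed systems also verify that client submissions are well-formed.

import secrets

P = 2**61 - 1   # prime modulus for the share arithmetic

def share(value, n_servers=2):
    # Split `value` into n additive shares that sum to `value` mod P.
    shares = [secrets.randbelow(P) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def aggregate(all_shares):
    # Each server sums the shares it holds; combining the per-server totals
    # reveals only the aggregate, never any individual client's value.
    per_server = [sum(column) % P for column in zip(*all_shares)]
    return sum(per_server) % P

client_values = [3, 14, 15, 9, 2]                  # e.g. private usage counters
submissions = [share(v) for v in client_values]    # one share per server per client
print(aggregate(submissions) == sum(client_values) % P)   # True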

Bio: 
Henry Corrigan-Gibbs is a Ph.D. candidate at Stanford, advised by Dan Boneh. His research interests are in computer security, applied cryptography, and online privacy. Henry and his collaborators have received the Best Young Researcher Paper Award at Eurocrypt 2018, the 2016 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies, and the 2015 IEEE Security and Privacy Distinguished Paper Award, and Henry's work has been cited by the IETF and NIST.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

The "D" word: solving for "diversity" on high-tech teams

Date and Time
Thursday, February 28, 2019 - 4:30pm to 5:30pm
Location
Bowen Hall 222
Type
Talk

Janet Vertesi
This talk gives an overview of the sociological factors that affect the construction of diverse, high-performing teams. Building on a summary of key issues that affect gender and racial disparities in high-performing occupations, the talk covers the current social science and vocabulary for addressing the problem, as well as strategies for moving forward, avoiding common traps, and protecting performance-based advancement. 

Bio:
Dubbed “Margaret Mead among the Starfleet” by the Times Literary Supplement, Janet Vertesi is Assistant Professor of Sociology at Princeton University and an expert in the sociology of science, technology, and organizations. Vertesi’s past decade of research, funded by the National Science Foundation, examines how distributed robotic spacecraft teams work together effectively to produce scientific and technical results. Her book Seeing Like a Rover (University of Chicago Press, 2015) describes the collaborative work of the Mars Exploration Rover mission, including the people, the images, and the robots who do science on Mars. Vertesi is also a long-time contributor to the Association for Computing Machinery conferences on human-computer interaction and computer-supported cooperative work. She is an advisory board member of the Data & Society Research Institute in New York City and is a member of Princeton University’s Center for Information Technology Policy.

To join the talk, please email seasdiversity@princeton.edu.

This event is co-sponsored by the School of Engineering and Applied Science and the Department of Computer Science.

Systems to Improve Online Discussion

Date and Time
Tuesday, April 16, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Adam Finkelstein

Amy Zhang
Discussions online are integral to everyday life, affecting how we learn, work, socialize, and participate in public society. Yet the systems that we use to conduct online discourse, whether they be email, chat, or forums, have changed little since their inception many decades ago. As more people participate and more venues for discourse migrate online, new problems have arisen and old problems have intensified. People are still drowning in information and must now also juggle dozens of disparate discussion silos. An unfortunately significant proportion of this online interaction is unwanted or unpleasant, with clashing norms leading to people bickering or getting harassed into silence. My research in human-computer interaction reimagines these outdated designs, building novel online discussion systems that fix what's broken about online discussion. To solve these problems, I develop tools that empower users and communities to take direct control over their experiences and information. These include: 1) summarization tools to make sense of large discussions, 2) annotation tools to situate conversations in the context of what is being discussed, and 3) moderation tools to give users more fine-grained control over content delivery.
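As a small, hypothetical illustration of what fine-grained control over content delivery could look like on the reader's side, the sketch below composes per-user moderation rules over message metadata and text; the rule format and message fields are invented for this example.

def build_filter(blocked_authors=(), muted_words=(), max_per_author=None):
    counts = {}
    def allow(message):
        author, text = message["author"], message["text"].lower()
        if author in blocked_authors:                  # user-specified blocklist
            return False
        if any(word in text for word in muted_words):  # user-specified muted topics
            return False
        if max_per_author is not None:                 # per-author rate limit
            counts[author] = counts.get(author, 0) + 1
            if counts[author] > max_per_author:
                return False
        return True
    return allow

inbox = [{"author": "alice", "text": "Meeting notes attached"},
         {"author": "troll42", "text": "you are all WRONG"},
         {"author": "bob", "text": "Free crypto giveaway!!"}]
my_filter = build_filter(blocked_authors={"troll42"}, muted_words={"crypto"})
print([m["author"] for m in inbox if my_filter(m)])   # ['alice']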

Bio:
Amy X. Zhang is a graduate student at MIT’s Computer Science and Artificial Intelligence Laboratory, focusing on human-computer interaction and social computing, and a 2018-19 Fellow at the Harvard Berkman Klein Center. She has interned at Microsoft Research and Google Research and received awards at ACM CHI and CSCW, and her work has been featured in stories by ABC News, BBC, CBC, and more. She holds an M.Phil. in Computer Science from the University of Cambridge, which she attended on a Gates Fellowship, and a B.S. in Computer Science from Rutgers, where she captained the Division I Women’s tennis team. Her research is supported by a Google PhD Fellowship and an NSF Graduate Research Fellowship.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

Safe and Reliable Reinforcement Learning for Continuous Control

Date and Time
Thursday, March 7, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Ryan Adams, CS and Yuxin Chen, EE

Stephen Tu

Many autonomous systems such as self-driving cars, unmanned aerial vehicles, and personalized robotic assistants are inherently complex. In order to deal with this complexity, practitioners are increasingly turning towards data-driven learning techniques such as reinforcement learning (RL) for designing sophisticated control policies. However, there are currently two fundamental issues that limit the widespread deployment of RL: sample inefficiency and the lack of formal safety guarantees. In this talk, I will propose solutions for both of these issues in the context of continuous control tasks. In particular, I will show that in the widely applicable setting where the dynamics are linear, model-based algorithms which exploit this structure are substantially more sample efficient than model-free algorithms, such as the widely used policy gradient method. Furthermore, I will describe a new model-based algorithm which comes with provable safety guarantees and is computationally efficient, relying only on convex programming. I will conclude the talk by discussing the next steps towards safe and reliable deployment of reinforcement learning.
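To give a concrete sense of what exploiting linear structure can mean, the sketch below performs the basic least-squares identification step that model-based approaches to linear systems typically build on: estimating x_{t+1} = A x_t + B u_t + w_t from a single rollout. It is illustrative only (the system matrices and rollout length are arbitrary) and omits the controller synthesis, robustness analysis, and safety constraints discussed in the talk.

import numpy as np

rng = np.random.default_rng(0)
n, m, T = 3, 2, 500                       # state dimension, input dimension, rollout length
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 0.7]])           # a stable "true" system
B = rng.normal(size=(n, m))

# Roll out the true system under random excitation inputs.
X = np.zeros((T + 1, n))
U = rng.normal(size=(T, m))
for t in range(T):
    X[t + 1] = A @ X[t] + B @ U[t] + 0.01 * rng.normal(size=n)

# Least-squares estimate of [A B] from (x_t, u_t) -> x_{t+1} pairs.
Z = np.hstack([X[:-1], U])                        # regressors
Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = Theta[:n].T, Theta[n:].T
print("estimation error in A:", np.linalg.norm(A_hat - A))
print("estimation error in B:", np.linalg.norm(B_hat - B))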

Bio:
Stephen Tu is a PhD student in Electrical Engineering and Computer Sciences at the University of California, Berkeley advised by Benjamin Recht. His research interests are in machine learning, control theory, optimization, and statistics. Recently, he has focused on providing safety and performance guarantees for reinforcement learning algorithms in continuous settings. He is supported by a Google PhD fellowship in machine learning. 

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

On the Foundations of Deep Learning: SGD, Overparametrization, and Generalization

Date and Time
Monday, March 11, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Elad Hazan

Jason Lee
Deep Learning has had phenomenal empirical successes in many domains including computer vision, natural language processing, and speech recognition. To consolidate and boost the empirical success, we need to develop a more systematic and deeper understanding of the elusive principles of deep learning.

In this talk, I will provide an analysis of several elements of deep learning, including non-convex optimization, overparametrization, and generalization error. First, we show that gradient descent and many other algorithms are guaranteed to converge to a local minimizer of the loss. For several interesting problems, including the matrix completion problem, this guarantees that we converge to a global minimum. Then we will show that gradient descent converges to a global minimizer for deep overparametrized networks. Finally, we analyze the generalization error by showing that a subtle interplay of SGD, logistic loss, and architecture promotes large-margin classifiers, which are guaranteed to have low generalization error. Together, these results show that on overparametrized deep networks, SGD finds solutions with both low training and test error.
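To make the margin-promotion claim concrete in its simplest setting, the toy sketch below runs gradient descent on the logistic loss for a linear classifier on separable data; the normalized margin keeps growing even though the loss never mentions margins. The data and hyperparameters are invented for illustration, and the talk's results concern the far harder overparametrized network setting.

import numpy as np

rng = np.random.default_rng(0)
# Two linearly separable clusters labeled +1 and -1.
X = np.vstack([rng.normal(loc=[2, 2], scale=0.3, size=(50, 2)),
               rng.normal(loc=[-2, -2], scale=0.3, size=(50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])

w, lr = np.zeros(2), 0.1
for step in range(1, 100_001):
    margins = y * (X @ w)
    # Gradient of the average logistic loss  mean(log(1 + exp(-y <x, w>))).
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad
    if step in (100, 1_000, 10_000, 100_000):
        # Margin of the normalized direction w / ||w||.
        print(step, (y * (X @ w)).min() / np.linalg.norm(w))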

Bio:
Jason Lee is an assistant professor in Data Sciences and Operations at the University of Southern California. Prior to that, he was a postdoctoral researcher at UC Berkeley working with Michael Jordan. Jason received his PhD at Stanford University, advised by Trevor Hastie and Jonathan Taylor. His research interests are in statistics, machine learning, and optimization. Lately, he has worked on the foundations of deep learning, non-convex optimization algorithms, and adaptive statistical inference. He received a Sloan Research Fellowship in 2019 and a NIPS Best Student Paper Award for his work.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

Optimizing the Automated Programming Stack

Date and Time
Thursday, April 18, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Host
Margaret Martonosi

James Bornholt
The scale and pervasiveness of modern software poses a challenge for programmers: software reliability is more important than ever, but the complexity of computer systems continues to grow. Automated programming tools are powerful weapons for programmers to tackle this challenge: verifiers that check software correctness, and synthesizers that generate new correct-by-construction programs. These tools are most effective when they apply domain-specific optimizations, but doing so today requires considerable formal methods expertise.

In this talk, I present a new application-driven approach to optimizing the automated programming stack underpinning modern domain-specific tools. I will demonstrate the importance of programming tools in the context of memory consistency models, which define the behavior of multiprocessor CPUs and whose subtleties often elude even experts. Our new tool, MemSynth, automatically synthesizes formal descriptions of memory consistency models from examples of CPU behavior. We have used MemSynth to synthesize descriptions of the x86 and PowerPC memory models, each of which previously required person-years of effort to describe by hand, and we found several ambiguities and underspecifications in both architectures. I will then present symbolic profiling, a new technique we designed and implemented to help people identify scalability bottlenecks in automated programming tools. These tools use symbolic evaluation, an execution model that evaluates all paths through a program and so defies both human intuition and standard profiling techniques. Symbolic profiling diagnoses scalability bottlenecks using a novel performance model for symbolic evaluation that accounts for all-paths execution. We have used symbolic profiling to find and fix performance issues in 8 state-of-the-art automated tools, improving their scalability by orders of magnitude, and our techniques have been adopted in industry. Finally, I will give a sense of the future of application-driven optimization of the automated programming stack, in which applications inspire improvements to the stack that in turn beget even more powerful automated tools.
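To give a flavor of example-guided synthesis in miniature, the sketch below enumerates a tiny space of candidate boolean specifications and keeps those consistent with all observed examples. The candidate language and examples are invented for illustration; MemSynth itself works over relational descriptions of memory models and litmus tests and relies on solver-backed search rather than brute-force enumeration.

# Candidate "specifications": boolean formulas over two observed flags.
CANDIDATES = {
    "a and b":       lambda a, b: a and b,
    "a or b":        lambda a, b: a or b,
    "a and not b":   lambda a, b: a and not b,
    "a == b":        lambda a, b: a == b,
    "not a":         lambda a, b: not a,
}

# Examples play the role of litmus tests: observed inputs plus the allowed outcome.
examples = [((True, True), True), ((True, False), False), ((False, False), True)]

def synthesize(examples):
    # Return every candidate formula consistent with all examples.
    return [name for name, formula in CANDIDATES.items()
            if all(formula(*inputs) == outcome for inputs, outcome in examples)]

print(synthesize(examples))   # ['a == b'] -- the unique consistent specification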

Bio:
James Bornholt is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Emina Torlak, Dan Grossman, and Luis Ceze. His research interests are in programming languages and formal methods, with a focus on automated program verification and synthesis. His work has received an ACM SIGPLAN Research Highlight, two IEEE Micro Top Picks selections, an OSDI best paper award, and a Facebook Ph.D. fellowship.

Lunch for talk attendees will be available at 12:00pm. 
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.
