Colloquium

Farewell to Servers: Software and Hardware Approaches towards Datacenter Resource Disaggregation

Date and Time
Tuesday, May 21, 2019 - 1:30pm to 2:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Amit Levy

Yiying Zhang
Datacenters have been using the "monolithic" server model for decades, where each server hosts a set of hardware devices like CPU and DRAM on a motherboard and runs an OS on top to manage the hardware resources. This monolithic server model fundamentally restricts datacenters from achieving efficient resource packing, hardware rightsizing, and great heterogeneity. Recent hardware and application trends such as serverless computing further call for a rethinking of the long-standing server-centric model. My answer is to "disaggregate" monolithic servers into network-attached hardware components that host different hardware resources and offer different functionalities (e.g., a processor component for computation, a memory component for fast data accesses). I believe that after evolving from physical (DC-1.0) to virtual (DC-2.0), datacenters should evolve further into a disaggregated form (DC-3.0), where hardware resources can be allocated and scaled to the exact amount that applications use and can be individually managed and customized for different application needs. By not having servers, DC-3.0 disrupts designs and technologies in almost every layer of today's datacenters, from hardware and networking to OS and applications. My lab has undertaken pioneering efforts in building an end-to-end solution for DC-3.0 with a new OS, a new hardware platform, and a new network system.

This talk will focus on two systems that are central to the design of DC-3.0: 1) LegoOS, a new distributed operating system designed for managing disaggregated resources. LegoOS splits OS functionalities into different units, each running at a hardware component and managing the component's hardware resources. LegoOS enables the disaggregation and customization of OS functionalities, a significant step towards building DC-3.0's software infrastructure. 2) LegoFPGA, a new approach to using FPGAs to efficiently manage and virtualize hardware resources. LegoFPGA offers a solution to co-design application, OS, and hardware functionalities and customize them for different hardware resources and application domains, an important step towards building DC-3.0's hardware infrastructure. With LegoOS and LegoFPGA, we demonstrate that separating core OS and hardware functionalities is not only feasible but can significantly improve performance per dollar over the current monolithic server model.
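The split-kernel idea behind LegoOS — each hardware component runs its own small monitor and is reached over the network rather than a local bus — can be sketched in a few lines. The sketch below is purely illustrative: the class names, message format, and operations are invented for this example and are not LegoOS's real interfaces.

```python
# Toy sketch of resource disaggregation: a process component with no
# local DRAM of its own reaches a memory component via messages,
# standing in for an RDMA-style network link.

class MemoryComponent:
    """Manages its own DRAM; serves allocate/read/write requests."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}
        self.next_addr = 0

    def handle(self, msg):
        op = msg["op"]
        if op == "alloc":
            addr = self.next_addr
            self.next_addr += msg["size"]
            return {"status": "ok", "addr": addr}
        if op == "write":
            self.store[msg["addr"]] = msg["data"]
            return {"status": "ok"}
        if op == "read":
            return {"status": "ok", "data": self.store.get(msg["addr"])}
        return {"status": "err"}

class ProcessComponent:
    """Runs computation; every memory operation becomes a message
    to a remote memory component instead of a local bus access."""
    def __init__(self, mem):
        self.mem = mem  # stand-in for the network link

    def run(self):
        r = self.mem.handle({"op": "alloc", "size": 4096})
        addr = r["addr"]
        self.mem.handle({"op": "write", "addr": addr, "data": b"hello"})
        return self.mem.handle({"op": "read", "addr": addr})["data"]

proc = ProcessComponent(MemoryComponent(capacity=1 << 30))
print(proc.run())  # b'hello'
```

Because the two components share only a message interface, each can be scaled, failed, or customized independently — the property the talk argues is central to DC-3.0.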

Bio:
Yiying Zhang is an assistant professor in the School of Electrical and Computer Engineering at Purdue University. Her research interests span operating systems, distributed systems, computer architecture, and datacenter networking. She also works at the intersection of systems and programming languages, security, and AI/ML. She won an OSDI best paper award in 2018 and an NSF CAREER award in 2019. Yiying’s lab is among the few groups in the world that build new OSes and full-stack, cross-layer systems. Yiying received her Ph.D. from the Department of Computer Sciences at the University of Wisconsin-Madison under the supervision of Andrea and Remzi Arpaci-Dusseau and worked as a postdoctoral scholar at the University of California, San Diego before joining Purdue.

To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu, at least one week prior to the event.

System and Architecture Design for Safe and Reliable Autonomous Robotic Applications

Date and Time
Tuesday, May 14, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Margaret Martonosi

Jishen Zhao
The rapid development of smart technology in edge computing systems has paved the way for self-driving cars and autonomous service robots. To enable the wide adoption of these autonomous robotic applications, reliability is one of the fundamental goals of computing system and architecture design. In this talk, I will present our recent exploration of safe and reliable system and architecture design for autonomous robotic applications. I will start with an architecture design that supports fast system recovery using persistent memory at low performance cost. To evaluate and guide our system design, I will then introduce our safety model and architecture design strategies for self-driving cars, based on our field study of running real industrial Level-4 autonomous driving fleets. Finally, I will describe a Linux-container-based resource management framework designed to improve the reliability and safety of self-driving cars and service robots.

Bio: Jishen Zhao is an Assistant Professor in the Computer Science and Engineering Department at the University of California, San Diego. Her research spans and stretches the boundary between computer architecture and system software, with a particular emphasis on memory and storage systems, domain-specific acceleration, and system reliability. Her research is driven by both emerging technologies (e.g., nonvolatile memories, 3D-stacked memory) and modern applications (e.g., smart home and autonomous robotic systems, deep learning, and big-data analytics). Before joining UCSD, she was an Assistant Professor at UC Santa Cruz, and before that a research scientist at HP Labs. She is a recipient of an NSF CAREER award and a MICRO best paper honorable mention award.

Lunch will be available at 12:00pm

CacheLib - Unifying & Abstracting HW for caching at Facebook

Date and Time
Friday, May 3, 2019 - 12:30pm to 1:30pm
Location
Computer Science 302
Type
Colloquium
Speaker
Michael Uhlar and Sathya Gunasekar, from Facebook
Host
Wyatt Lloyd

In order to operate with high efficiency, Facebook’s infrastructure relies on caching in many different backend services. These services place very different demands on their caches, e.g., in terms of working set sizes, access patterns, and throughput requirements. Historically, each service used a different cache implementation, leading to inefficiency, duplicated code, and duplicated effort.

CacheLib is an embedded caching engine that addresses these varied demands with a unified API for building caches across many hardware media. CacheLib transparently combines volatile and non-volatile storage in a single caching abstraction, providing a flexible, high-performance solution for many different services at Facebook. In this talk, we describe CacheLib’s design, challenges, and several lessons learned.
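The core abstraction — one get/put interface over tiered volatile and non-volatile storage — can be illustrated with a toy two-tier LRU cache in which DRAM evictions spill to a flash tier. This is only a sketch of the idea; the class, methods, and eviction policy here are invented for illustration and do not reflect CacheLib's actual API or internals.

```python
from collections import OrderedDict

class HybridCache:
    """Toy two-tier cache behind a single get/put API: a small fast
    (DRAM-like) tier backed by a larger slow (flash-like) tier."""
    def __init__(self, dram_items, flash_items):
        self.dram = OrderedDict()   # LRU order: oldest first
        self.flash = OrderedDict()
        self.dram_cap = dram_items
        self.flash_cap = flash_items

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_cap:
            old_key, old_val = self.dram.popitem(last=False)  # evict LRU
            self.flash[old_key] = old_val                     # spill to flash tier
            if len(self.flash) > self.flash_cap:
                self.flash.popitem(last=False)                # drop entirely

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key]
        if key in self.flash:            # flash hit: promote back to DRAM
            value = self.flash.pop(key)
            self.put(key, value)
            return value
        return None                      # miss

cache = HybridCache(dram_items=2, flash_items=2)
for k in ("a", "b", "c"):
    cache.put(k, k.upper())
print(cache.get("a"))  # "a" was evicted to the flash tier, still a hit: A
```

The point of the single abstraction is that callers never see which tier served them, which is what lets one engine cover services with very different working set sizes.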


Earable Computers : Ear-worn Systems for Healthcare, HCI, BCI, and Brain Stimulations

Date and Time
Thursday, February 7, 2019 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Jennifer Rexford

Tam Vu
This talk introduces the concept of "Earable computers": small computing and actuating devices that are worn inside, behind, around, or on a user's ears. Earable sensing and actuation are motivated by the fact that human ears are relatively close to the sources of many important physiological signals, such as the brain, eyes, facial muscles, heart, and core body temperature. Placing sensors and associated stimulators inside the ear canals or behind the ears could therefore open up a wide range of applications, from improving cognitive function, keeping truck drivers from falling asleep while driving, and extending attention span, to quantifying pain and suffering, reducing opioid use, and suppressing seizures, to name a few. This talk will discuss the opportunities that earable systems could bring and the system challenges that must be addressed to unleash their potential. I will share our experience and lessons learned in realizing such systems in the context of human-computer interaction, brain-computer interaction, and healthcare.

Bio: 
Tam Vu is an Assistant Professor in the Computer Science Department at the University of Colorado Boulder. He directs the Mobile and Networked Systems (MNS) Lab at the university, where he and his team conduct systems research in the areas of wearable and mobile systems, including mobile healthcare, mobile security, cyber-physical systems, and wireless sensing. His research has been recognized with an NSF CAREER award, two Google Faculty Awards, ten best paper awards, best paper nominations, and research highlights in flagship venues in mobile systems research, including MobiCom, MobiSys, and SenSys. He also actively pushes his research outcomes into practice through technology transfer, with 17 patents filed and two start-ups that he co-founded to commercialize them.

To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.

Hardware is the New Software: Finding Exploitable Bugs in Hardware Designs

Date and Time
Monday, February 4, 2019 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Margaret Martonosi

Cynthia Sturton
Bugs in hardware designs can create vulnerabilities that open the machine to malicious exploit. Despite mature functional validation tools and new research in designing secure hardware, the question of how to find and recognize those bugs remains open. My students and I have developed two tools in response to this question. The first is a security specification miner; it semi-automatically identifies security-critical properties of a design specified at the register transfer level. The second tool, Coppelia, is a symbolic execution engine that explores a hardware design and generates complete exploits for the security bugs it finds. We use Coppelia and our set of generated security properties to find new bugs in the open-source RISC-V and OR1k CPU architectures.

Bio:
Cynthia Sturton is an Assistant Professor and Peter Thacher Grauer Fellow at the University of North Carolina at Chapel Hill. She leads the Hardware Security @ UNC research group to investigate the use of static and dynamic analysis to protect against vulnerable hardware designs. Her research is funded by several National Science Foundation awards, the Semiconductor Research Corporation, Intel, a Junior Faculty Development Award from the University of North Carolina, and a Google Faculty Research Award. She was recently awarded the Computer Science Departmental Teaching Award at the University of North Carolina. Sturton received her B.S.E. from Arizona State University and her M.S. and Ph.D. from the University of California, Berkeley. 

Lunch for talk attendees will be available at 12:00pm

***CANCELED*** Make Your Database Dream of Electric Sheep: Designing for Autonomous Operation

Date and Time
Friday, November 16, 2018 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium

Andy Pavlo
***DUE TO WEATHER THIS TALK HAS BEEN CANCELED***

In the last 20 years, researchers and vendors have built advisory tools to assist DBAs in tuning and physical design. Most of this previous work is incomplete because these tools require humans to make the final decisions about any database changes, and because they are reactionary measures that fix problems only after they occur. What is needed for a "self-driving" DBMS are components that are designed for autonomous operation. This will enable new optimizations that are not possible today, because the complexity of managing these systems has surpassed the abilities of humans.

In this talk, I present the core design principles of an autonomous DBMS. These principles are necessary to support ample data collection, fast state changes, and accurate reward observations. I will discuss techniques for building a new autonomous DBMS, as well as the steps needed to retrofit an existing one to enable automated management. Our work is based on our experiences at CMU developing an automatic tuning service (OtterTune) and a self-driving DBMS (Peloton).

Bio:
Andy Pavlo is an Assistant Professor of Databaseology in the Computer Science Department at Carnegie Mellon University. He also used to raise clams.

Datacenters and Energy Efficiency: A Game-Theoretic Perspective

Date and Time
Tuesday, May 1, 2018 - 4:30pm to 5:30pm
Location
Andlinger Center Maeder Hall
Type
Colloquium

Sharing datacenter hardware improves energy efficiency, but whether strategic users participate in consolidated systems depends on management policies. Users who dislike their allocations may refuse to participate and instead deploy private, less-efficient systems. We rethink systems management, drawing on game theory to model strategic behavior and incentivize participation. We illustrate this perspective with two fundamental challenges in datacenters. For power delivery, we design sprinting games to produce equilibria in which users selfishly draw power for performance boosts yet avoid oversubscribing the shared supply. For resource allocation, we use Cobb-Douglas utility functions to produce fair allocations that incentivize users to share cache and memory. These solutions provide foundations for rigorously managing systems shared by strategic, competitive participants.
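Cobb-Douglas utilities have the convenient closed form U_i(x) = prod_r x_{i,r}^{alpha_{i,r}} (with each user's alphas summing to 1), under which a user optimally spends the fraction alpha_{i,r} of her budget on resource r, so a market equilibrium can be computed in closed form. The sketch below shows one standard such construction, a competitive equilibrium from equal incomes; it is an illustrative example of the economic machinery, not necessarily the exact mechanism presented in the talk.

```python
def ceei_allocation(alphas, capacities, budgets=None):
    """Competitive equilibrium from equal incomes under Cobb-Douglas
    utilities. alphas[i][r] is user i's elasticity for resource r
    (rows sum to 1); capacities[r] is the shared supply of resource r.
    Each user spends budget * alpha on each resource; market-clearing
    prices then give each user her proportional share."""
    n = len(alphas)
    budgets = budgets or [1.0] * n  # "equal incomes" by default
    per_resource = []
    for r, cap in enumerate(capacities):
        spend = sum(b * a[r] for b, a in zip(budgets, alphas))  # total spend on r
        per_resource.append([b * a[r] / spend * cap
                             for b, a in zip(budgets, alphas)])
    # transpose so result[i][r] is user i's allocation of resource r
    return [[per_resource[r][i] for r in range(len(capacities))]
            for i in range(n)]

# Two users share 32 MB of cache and 64 GB of memory; user 0 is
# cache-hungry (alpha = [0.75, 0.25]), user 1 memory-hungry.
alloc = ceei_allocation(alphas=[[0.75, 0.25], [0.25, 0.75]],
                        capacities=[32.0, 64.0])
print(alloc)  # [[24.0, 16.0], [8.0, 48.0]]
```

Because each user's share of a resource is proportional to how much she values it relative to everyone else, users with complementary demands (as above) both end up with more of the resource they care about than an equal split would give them — the incentive to share.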
 
Bio:
Benjamin Lee is an Associate Professor of Electrical and Computer Engineering at Duke University. He received his B.S. from the University of California at Berkeley (2004) and his Ph.D. from Harvard University (2008), and completed his postdoctoral work at Stanford University (2010). He has held visiting positions at Microsoft Research, Intel Labs, and Lawrence Livermore National Lab. Dr. Lee’s research interests include computer architecture, energy efficiency, and security/privacy. He pursues these interests by building interdisciplinary links to statistical inference and algorithmic economics. His research has been recognized by IEEE Micro Top Picks (4x), Communications of the ACM Research Highlights (3x), as well as paper honors from the ASPLOS, HPCA, MICRO, and SC conferences. He received the NSF Computing Innovation Fellowship, NSF CAREER Award, and Google Faculty Research Award.

"Quantum Supremacy" and the Complexity of Random Circuit Sampling

Date and Time
Monday, April 23, 2018 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Sanjeev Arora

A critical goal for the field of quantum computation is quantum supremacy -- a demonstration of any quantum computation that is prohibitively hard for classical computers. Besides dispelling any skepticism about the viability of quantum computers, quantum supremacy also provides a test of quantum theory in the realm of high complexity. A leading near-term candidate, put forth by the Google/UCSB team, is sampling from the probability distributions of randomly chosen quantum circuits, called Random Circuit Sampling (RCS).

While RCS was defined with experimental realization in mind (the first results are expected later this year), we give the first complexity-theoretic evidence of classical hardness of RCS, placing it on par with the best theoretical proposals for supremacy. Specifically, we show that RCS satisfies an average-case hardness condition -- computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard. Our reduction exploits the polynomial structure in the output amplitudes of random quantum circuits, enabled by the Feynman path integral. We also describe a new verification measure which in some formal sense maximizes the information gained from experimental samples.
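For a handful of qubits, the classical side of RCS is easy to reproduce: draw a random unitary, compute the Born probabilities of its outputs, and sample. The toy below uses a Haar-random matrix as a stand-in for a random circuit (real supremacy proposals use shallow circuits of local gates, which is what makes large instances hard to simulate); it is illustrative only.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Haar-random unitary via QR of a complex Ginibre matrix,
    with the standard phase correction on R's diagonal."""
    z = (rng.standard_normal((dim, dim))
         + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # multiply columns by phases

rng = np.random.default_rng(0)
n = 8                       # qubits; Hilbert-space dimension 2^n
dim = 2 ** n
U = haar_unitary(dim, rng)

# Output distribution on input |0...0>: Born probabilities of column 0.
probs = np.abs(U[:, 0]) ** 2
samples = rng.choice(dim, size=5, p=probs)

# Sanity check: a valid distribution. (For random circuits the
# individual probabilities follow the Porter-Thomas shape, roughly
# exponential with mean 1/dim.)
print(round(float(probs.sum()), 6))  # 1.0
```

The hardness result in the talk concerns exactly these `probs`: computing them for typical circuits is shown to be as hard as the worst case, hence #P-hard, even though sampling a few bitstrings from a small instance like this is trivial.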

Based on joint work with Adam Bouland, Bill Fefferman and Chinmay Nirkhe. 

Deep Learning and Cognition

Date and Time
Thursday, November 16, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Dr. Sanjeev Arora

Christopher Manning
Deep learning has had enormous success on perceptual tasks but still struggles to provide a model for inference. To address this gap, we have been developing Compositional Attention Networks (CANs). The CAN design provides a strong prior for explicitly iterative reasoning, enabling it to support explainable and structured learning, as well as generalization from a modest amount of data. The model builds on the great success of existing recurrent cells such as LSTMs: a CAN is a recurrent sequence of a single Memory, Attention, and Control (MAC) cell, and by careful design imposes structural constraints on the operation of each cell and the interactions between them, incorporating explicit control and soft attention mechanisms into their interfaces. We demonstrate the model’s strength and robustness on the challenging CLEVR dataset for visual reasoning (Johnson et al. 2016), achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model. More importantly, we show that the new model is more computationally efficient and data-efficient, requiring an order of magnitude less time and/or data to achieve good results. Joint work with Drew Arad.
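The structure of one reasoning step can be sketched with plain soft attention: the control unit attends over the question to pick the current operation, the read unit attends over the knowledge base conditioned on control and memory, and the write unit folds the retrieved information into memory. The weight names and update rules below are highly simplified stand-ins, not the paper's exact equations.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mac_step(control, memory, question_words, knowledge, W):
    """One simplified MAC-style step (illustrative only)."""
    # Control: attention over question words selects the reasoning op.
    c_scores = question_words @ (W["cq"] @ control)
    control = softmax(c_scores) @ question_words
    # Read: attention over knowledge items, conditioned on both the
    # current control and the accumulated memory.
    interaction = knowledge * (W["km"] @ memory)
    r_scores = interaction @ (W["rc"] @ control)
    read = softmax(r_scores) @ knowledge
    # Write: integrate the retrieved information into memory.
    memory = np.tanh(W["wm"] @ memory + W["wr"] @ read)
    return control, memory

d = 4
rng = np.random.default_rng(0)
W = {k: rng.standard_normal((d, d)) * 0.1
     for k in ("cq", "km", "rc", "wm", "wr")}
control, memory = rng.standard_normal(d), rng.standard_normal(d)
question = rng.standard_normal((5, d))    # 5 question-word embeddings
knowledge = rng.standard_normal((6, d))   # 6 knowledge-base items
control, memory = mac_step(control, memory, question, knowledge, W)
print(control.shape, memory.shape)  # (4,) (4,)
```

Chaining several such steps, each with its own attention over the question, is what yields the explicitly iterative, inspectable reasoning the abstract describes.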

Bio: Christopher Manning is the Thomas M. Siebel Professor in Machine Learning, Linguistics and Computer Science at Stanford University. He works on software that can intelligently process, understand, and generate human language material.  He is a leader in applying Deep Learning to Natural Language Processing, including exploring Tree Recursive Neural Networks, sentiment analysis, neural network dependency parsing, the GloVe model of word vectors, neural machine translation, and deep language understanding. He also focuses on computational linguistic approaches to parsing, robust textual inference and multilingual language processing, including being a principal developer of Stanford Dependencies and Universal Dependencies. Manning is an ACM Fellow, a AAAI Fellow, an ACL Fellow, and a Past President of ACL. He has coauthored leading textbooks on statistical natural language processing and information retrieval. He is a member of the Stanford NLP group (@stanfordnlp) and manages development of the Stanford CoreNLP software.

Data+

Date and Time
Tuesday, February 14, 2017 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Amit Singer, PACM CS CSML

In this talk I will explore how data science is changing the way we practice education and research.

My title, Data+, is the name of a ten-week summer research experience offered by the Information Initiative at Duke (iiD), in which undergraduates join small project teams, working alongside other teams in a communal environment. They learn how to marshal, analyze, and visualize data, as well as how to communicate with the client sponsoring the project. The 2016 program involved about 70 undergraduate students and 30 graduate student and postdoctoral mentors working on 25 projects. I will describe projects that illuminate different ways in which student-centered inquiry can transform university institutions.

Our ambition for research is that it impacts society. We used to think of impact as starting with an idea, then developing that idea into a prototype, then turning the prototype into a product, then marketing the product, and so on – it is a long march and the problem with long marches is that most ideas don’t make it. I will describe an example of a different model, one where society is part of the research process. It is a collaboration with Apple on child mental health led by Guillermo Sapiro.
 
