CS Department Colloquium Series | Page 27 | Computer Science Department at Princeton University

Local erasure coding for data storage

Date and Time

Wednesday, October 23, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Sergey Yekhanin, from Microsoft Research

Host

Zeev Dvir

Historically, most large distributed storage systems (e.g., Hotmail) have been using replication to provide reliability against machine failures. Today however as the amount of stored data reaches multiple Exabytes keeping few copies of data around is becoming prohibitively expensive. Therefore more and more systems are adopting erasure coding in place of replication.

Local Reconstruction Codes (LRCs) are a new class of erasure correcting codes designed specifically for applications in data storage. Built upon the rich mathematical theory of locally decodable codes developed in the theory community, LRCs provide high level of reliability and allow data fragments to be reconstructed quickly in typical failure scenarios. LRCs have been recently deployed by Windows Azure Storage and are going to ship in Windows 8.1 and Windows Server 2012.

In this talk we will discuss motivation behind local reconstruction codes and cover the main technical challenges and tradeoffs in the design of these codes.

(Based on joint papers with Brad Calder, Michael Forbes, Parikshit Gopalan, Cheng Huang, Bob Jenkins, Jin Li, Aaron Ogus, Huseyin Simitci, and Yikang Xu.)

Medical Device Cyber Security: The First 164 Years

Date and Time

Thursday, October 10, 2013 - 4:30pm to 5:30pm

Location

Friend Center 006

Type

CS Department Colloquium Series

Speaker

Kevin Fu, from University of Michigan

Host

Michael Freedman

The U.S. Institute of Medicine commissioned my 2011 report on the role of trustworthy software in the context of the "510(k)" U.S. medical device regulations. More recently, the U.S. Food and Drug Administration has released draft guidance on cybersecurity for medical device manufacturing. This talk will provide a glimpse into the risks, benefits, and regulatory issues for medical device cybersecurity and innovation of trustworthy medical device software.

Today, it would be difficult to find medical device technology that does not critically depend on computer software. The technology enables patients to lead more normal and healthy lives. However, medical devices that rely on software (e.g., drug infusion pumps, linear accelerators) continue to injure or kill patients in preventable ways---despite the lessons learned from the tragic radiation incidents of the Therac-25 era. The lack of trustworthy medical device software leads to shortfalls in properties such as safety, effectiveness, dependability, reliability, usability, security, and privacy.

Come learn a bit about the science, technology, and policy that shapes medical device software.

Kevin Fu is an Associate Professor of Electrical Engineering and Computer Science at the University of Michigan where he directs the Archimedes Center for Medical Device Security and the SPQR group. His research investigates how to achieve trustworthy computing on embedded devices with application to health care, commerce, and communication. His most recent contributions appear in computer science and medical conferences and journals such as USENIX Security and IEEE Security and Privacy.

Kevin received his PhD in EECS from the MIT. Fu received a Sloan Research Fellowship, NSF CAREER award, Fed100 Award, and best paper awards from various academic silos of computing. The research is featured in critical articles by the NYT, WSJ, and NPR. Kevin was named MIT Technology Review TR35 Innovator of the Year for work on medical device security. Kevin has testified in Congress on health matters and has written commissioned work for the Institute of Medicine of the National Academies. He served as a visiting scientist at the Food & Drug Administration, the Beth Israel Deaconess Medical Center of Harvard Medical School, Microsoft Research, and MIT CSAIL. Previous employers include Bellcore, Cisco Systems, HP Labs, and Holland Community Hospital. He is a member of the ACM Committee on Computers and Public Policy and the NIST Information Security and Privacy Advisory Board. Kevin also holds a certificate of achievement in artisanal bread making from the French Culinary Institute.

From Potential to Promise - Developing Scholars, one Eureka moment at a time

Date and Time

Wednesday, October 2, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Rajiv Gandhi, from Rutgers University, Camden

Host

Moses Charikar

In this talk, I will tell the story of our work with some truly remarkable undergraduate students at Rutgers-Camden, who despite many odds have achieved success that is unprecedented for the Camden campus. I will discuss the various challenges that we faced and some ideas that have worked very well (and some that have not) for us. We have been applying some of these ideas in our work with high school students and students at other institutions. This talk should be of interest to all faculty and students.

Dr. Rajiv Gandhi is an Associate Professor of Computer Science at the Rutgers University-Camden. He received his Ph.D. in Computer Science from the University of Maryland, College Park in 2003. He worked as a software engineer at Qualcomm from 1994-96. His research interests lie in the broad area of theoretical computer science. Specifically, he is interested in approximation and randomized algorithms, distributed algorithms, and graph theory. He has published papers in these areas in leading journals and conferences. He has been the recipient of several teaching excellence awards -- at Rutgers-Camden and at other universities. He was also the recipient of the Chancellor's award for Civic Engagement at Rutgers-Camden in 2013. He was a Fulbright Fellow from Jan-June 2011, during which he worked with students in Mumbai. Since 2009, he has also been working with high school students as part of the Program in Discrete Mathematics and Computer Science.

How to Encrypt Software

Date and Time

Wednesday, September 25, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Amit Sahai, from UCLA

Host

Sanjeev Arora

The goal of secure program obfuscation is to make a program “unintelligibleâ€ while preserving its functionality. For decades, program obfuscation for general programs has remained an art, with all public general-purpose obfuscation methods known to be broken.

In this talk, we will describe new developments this year that for the first time provide a mathematical approach to the problem of general-purpose program obfuscation, where extracting secrets from the obfuscated program requires solving mathematical problems that currently take hundreds of years to solve on the world's fastest computing systems. We will also discuss the implications of these developments.

This talk is based on joint works with Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, and Brent Waters.

Professor Amit Sahai received his Ph.D. in Computer Science from MIT in 2000. From 2000 to 2004, he was on the faculty at Princeton University; in 2004 he joined UCLA, where he currently holds the position of Professor of Computer Science. His research interests are in security and cryptography, and theoretical computer science more broadly. He has published more than 100 original technical research papers at venues such as the ACM Symposium on Theory of Computing (STOC), CRYPTO, and the Journal of the ACM. He has given a number of invited talks at institutions such as MIT, Stanford, and Berkeley, including the 2004 Distinguished Cryptographer Lecture Series at NTT Labs, Japan. Professor Sahai is the recipient of numerous honors; he was named an Alfred P. Sloan Foundation Research Fellow in 2002, received an Okawa Research Grant Award in 2007, a Xerox Foundation Faculty Award in 2010, a Google Faculty Research Award in 2010, and the 2013 Pazy Memorial Award.

The Knowledge of Preconditions principle

Date and Time

Monday, September 9, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Yoram Moses, from Technion

This talk will present a simple principle relating knowledge and action in distributed systems: If some condition must be true when a given action is taken, then this condition must be known when the action is taken. This yields a number of fundamental connections between knowledge and multi-party coordination. In particular, the talk will illustrate how these can provide insight into the interplay between time, communication and coordination in networks and distributed systems. The talk will be self-contained, intended for a general CS audience.

The latter part of the talk is based on joint work with Ido Ben Zvi. No familiarity with the subject will be assumed.

Yoram Moses is the Israel Pollack Academic Chair at the Technion. He is a recipient of the Godel Prize 1997 and the Dijkstra Award in 2009.

Design Mining the Web

Date and Time

Monday, April 15, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Ranjitha Kumar, from Stanford University

Host

Rebecca Fiebrink

The Web has transformed the nature of creative work. For the first time, millions of people have a direct outlet for sharing their creations with the world. As a result, the Web has become the largest repository of design knowledge in human history, and the ensuing "democratization of design" has created a critical feedback loop, engendering a new culture of reuse and remixing.

The means and methods designers employ to draw on prior work, however, remain mostly informal and ad hoc. How can content producers find relevant examples amongst hundreds of millions of possibilities and leverage existing design practice to inform and improve their creations? My research explores data-driven techniques for working with examples at scale during the design process, automating search and curation, enabling rapid retargeting, and learning generative probabilistic models to support new design interactions. Knowledge discovery and data mining have revolutionized informatics; in this talk, I'll discuss what we can learn from mining design.

Ranjitha Kumar is a PhD candidate in the Computer Science Department at Stanford University, where she builds principled, data-driven tools for amplifying human creativity in design. Her work has received best paper awards or nominations at both of the premiere HCI conferences (CHI and UIST), and been recognized by the machine learning community through invited papers at IJCAI and ICML. She is the recipient of the 2011 Google PhD Fellowship in Design Development, and holds a BS in Computer Science from Stanford.

Learning in an Adversarial World, with Connections to Pricing, Hedging, and Routing

Date and Time

Tuesday, April 2, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Jacob Duncan Abernethy, from University of Pennsylvania

Host

Robert Schapire

Machine Learning is often viewed through the lens of statistics, where one tries to model or fit a set of data under stochastic conditions; for example, it is typical to assume one's observations were sampled IID. However, dating back to results of Blackwell and Hannan from the 1950s we know how to construct learning and decision strategies that possess robust guarantees even under adversarial conditions. Within this setting the goal of the learner is to "minimize regret" against any sequence of inputs. In this talk we lay out the framework, discuss some recent results, and we finish by exploring a couple of surprising applications and connections, including: (a) market making in combinatorial prediction markets, (b) routing with limited feedback, and (c) hedging derivative securities (e.g. European option contracts) in the worst-case, with a connection to the classical Black-Scholes option-pricing model.

Jake received his undergraduate degree in Mathematics from MIT in 2002 and a Master's degree in Computer Science from TTI-C in 2006. He recently finished a PhD in Computer Science at UC Berkeley, advised by Peter Bartlett, and he is now the Simons Postdoctoral Fellow at University of Pennsylvania. Jake has a particular focus on the intersection between machine learning, games and markets.

Making Big Data Analytics Interactive and Real-Time

Date and Time

Thursday, April 11, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Matei Zaharia, from University of California, Berkeley

Host

Michael Freedman

The rapid growth in data volumes requires new computer systems that scale out across hundreds of machines. While early frameworks, such as MapReduce, handled large-scale batch processing, the demands on these systems have also grown. Users quickly needed to run (1) more interactive ad-hoc queries, (2) more complex multi-pass algorithms (e.g. machine learning and graph processing), and (3) real-time processing on large data streams. In this talk, we present a single abstraction, resilient distributed datasets (RDDs), that supports all of these emerging workloads by providing efficient and fault-tolerant in-memory data sharing. We have used RDDs to build a stack of computing systems including the Spark parallel engine, Shark SQL processor, and Spark Streaming engine. Spark and Shark can run machine learning algorithms and interactive queries up to 100x faster than Hadoop MapReduce, while Spark Streaming enables fault-tolerant stream processing at significantly higher scales than were possible before. These systems, along with several resource allocation and scheduling algorithms we have developed along the way, have been used in multiple industry and research applications, and have a growing open source community with 14 companies contributing in the past year.

Matei Zaharia is a PhD student at UC Berkeley, working with Scott Shenker and Ion Stoica on topics in systems, cloud computing and networking. He is also a committer on Apache Mesos and Apache Hadoop. His work is supported by a Google PhD fellowship. Matei got his undergraduate degree at the University of Waterloo in Canada.

Programming and Proving in Homotopy Type Theory

Date and Time

Monday, March 25, 2013 - 4:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Daniel Licata, from Carnegie Mellon University & Institute for Advanced Study

Host

David Walker

There is a strong correspondence between mathematics and programming, under which mathematical proofs correspond to programs. A _proof assistant_ is a tool that a mathematician/programmer can use to represent proofs/programs, in such a way that a computer can verify whether or not they are correct. The use of proof assistants in math and computer science is becoming ever more important for managing the increasing complexity of proofs and programs. Proof assistants have been used to check significant mathematical theorems, such as the Four Color Theorem and the Feit-Thompson Odd-Order Theorem, as well as large pieces of software, such as a C compiler and the definition of the Standard ML programming language.

In this talk, I will describe my work on Homotopy Type Theory, which is a new foundation for proof assistants. Homotopy Type Theory extends type theory, the basis of several successful proof assistants, with new principles that bring it closer to informal mathematical practice. Moreover, it is based on an exciting new correspondence between type theory and the mathematical disciplines of homotopy theory and category theory, and consequently can be used to directly describe structures in these areas of math. I will describe computer-checked proofs of some basic results in algebraic topology, including calculations of homotopy groups of spheres and Eilenberg-MacLane spaces. Homotopy Type Theory is also a programming language, in which the new mathematical principles correspond to new programming techniques, which make certain programs easier to write. I will also describe my work on using Homotopy Type Theory as a programming language and its applications in computer science.

Dan Licata received his PhD from Carnegie Mellon University in 2011, advised by Robert Harper. His dissertation, Dependently Typed Programming with Domain-Specific Logics, won the FoLLI E.W. Beth Dissertation Award in 2012. After graduating, he spent three semesters as a teaching fellow at Carnegie Mellon, designing and delivering a new introductory course on functional programming, parallelism, and verification. This year, he is a post-doctoral member at the Institute for Advanced Study, which is hosting a year-long program on an exciting new connection between programming languages and mathematics. His research focuses on creating better tools for computer-checked mathematics and software verification.

Fast learning algorithms for discovering the hidden structure in data

Date and Time

Thursday, March 28, 2013 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Daniel Hsu, from Microsoft Research New England

Host

Robert Schapire

A major challenge in machine learning is to reliably and automatically discover hidden structure in data with minimal human intervention. For instance, one may be interested in understanding the stratification of a population into subgroups, the thematic make-up of a collection of documents, or the dynamical process governing a complex time series. Many of the core statistical estimation problems for these applications are, in general, provably intractable for both computational and statistical reasons; and therefore progress is made by shifting the focus to realistic instances that rule out the intractable cases. In this talk, I'll describe a general computational approach for correctly estimating a wide class of statistical models, including Gaussian mixture models, Hidden Markov models, Latent Dirichlet Allocation, Probabilistic Context Free Grammars, and several more. The key idea is to exploit the structure of low-order correlations that is present in high-dimensional data. The scope of the new approach extends beyond the purview of previous algorithms; and it leads to both new theoretical guarantees for unsupervised machine learning, as well as fast and practical algorithms for large-scale data analysis.

Daniel Hsu is a postdoc at Microsoft Research New England. Previously, he was a postdoc with the Department of Statistics at Rutgers University and the Department of Statistics at the University of Pennsylvania from 2010 to 2011, supervised by Tong Zhang and Sham M. Kakade. He received his Ph.D. in Computer Science in 2010 from UC San Diego, where he was advised by Sanjoy Dasgupta; and his B.S. in Computer Science and Engineering in 2004 from UC Berkeley. His research interests are in algorithmic statistics and machine learning.