
CS Department Colloquium Series

Inference Attacks: Understanding Privacy in the Era of "Privacy is Dead"

Date and Time
Tuesday, February 24, 2015 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ed Felten

Matt Fredrikson

As data from far-reaching sources is collected, aggregated, and re-packaged to enable new and smarter applications, confidentiality and data security are at greater risk than ever before. Some of the most surprising and invasive threats to materialize in recent years are brought about by so-called inference attacks: successful attempts to learn sensitive information by leveraging public data such as social network updates, published research articles, and web APIs.

In this talk, I will focus on two of my research efforts to better understand and defend against these attacks. First I will discuss work that examines the privacy risks that arise when machine learning models are used in a popular medical application, and illustrate the consequences of applying differential privacy as a defense. This work uncovered a new type of inference attack on machine learning models, and shows via an in-depth case study how to understand privacy "in situ" by balancing the attacker's chance of success against the likelihood of harmful medical outcomes. The second part of the talk will detail work that helps developers correctly write privacy-aware applications using verification tools. I will illustrate how a wide range of confidentiality guarantees can be framed in terms of a new logical primitive called Satisfiability Modulo Counting, and describe a tool that I have developed around this primitive that automatically finds privacy bugs in software (or proves that the software is bug-free). Through a better understanding of how proposed defenses impact real applications, and by providing tools that help developers implement the correct defense for their task, we can begin to proactively identify potential threats to privacy and take steps to ensure that they will not surface in practice.
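
For readers new to this attack class, here is a minimal, hypothetical sketch of the model-inversion idea at the heart of this line of work (the interface and names are illustrative assumptions, not the speaker's actual algorithm): given query access to a model's confidence scores and partial knowledge of a victim's record, an attacker searches over the unknown sensitive feature for the value the model deems most consistent with what is already known.

    # Illustrative sketch only: predict_proba stands in for any model API
    # that returns per-label confidence scores (e.g., a medical dosing model).
    def invert_sensitive_feature(predict_proba, known_features, candidates, true_label):
        """Guess the sensitive feature value that, together with the known
        features, makes the model most confident in the victim's known label."""
        best_guess, best_score = None, float("-inf")
        for value in candidates:
            record = known_features + [value]
            score = predict_proba(record)[true_label]  # model confidence
            if score > best_score:
                best_guess, best_score = value, score
        return best_guess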

Matt Fredrikson is a Ph.D. candidate in the Department of Computer Sciences at the University of Wisconsin-Madison. His research interests lie at the intersection of security, privacy, and formal methods, covering topics in software security, privacy issues associated with machine-learning models, and applied cryptography. His work has been profiled by Reuters, Technology Review, and New Scientist, and it received the Best Paper Award at USENIX Security 2014. He is a recipient of the Microsoft Research Graduate Fellowship.

Privacy in the Land of Plenty

Date and Time
Thursday, February 19, 2015 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Sanjeev Arora

Privacy-preserving data analysis has a large literature that spans several disciplines.  "Differential privacy" -- a notion tailored to situations in which data are plentiful -- has provided a theoretically sound and powerful framework, and given rise to an explosion of research.  We will review the definition of differential privacy, describe some algorithmic contributions, and conclude with a surprising application.
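
For background, the definition under review: a randomized algorithm M is epsilon-differentially private if, for every pair of datasets D and D' differing in a single record and every set of outputs S, Pr[M(D) in S] <= e^epsilon * Pr[M(D') in S]. Below is a minimal sketch of the classic Laplace mechanism, one of the basic algorithmic tools in this framework (an illustration, not code from the talk):

    import numpy as np

    def laplace_mechanism(query_result, sensitivity, epsilon):
        """Release a numeric query result with epsilon-differential privacy
        by adding Laplace noise with scale sensitivity / epsilon."""
        return query_result + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

    # A counting query has sensitivity 1 (adding or removing one record
    # changes the count by at most 1), so noise of scale 1/epsilon suffices.
    noisy_count = laplace_mechanism(query_result=1042, sensitivity=1, epsilon=0.5)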

Cynthia Dwork, Distinguished Scientist at Microsoft Research, is renowned for placing privacy-preserving data analysis on a mathematically rigorous foundation. A cornerstone of this work is differential privacy, a strong privacy guarantee frequently permitting highly accurate data analysis. Dr. Dwork has also made seminal contributions in cryptography and distributed computing, and is a recipient of the Edsger W. Dijkstra Prize, recognizing some of her earliest work establishing the pillars on which every fault-tolerant system has been built for decades. She is a member of the National Academy of Sciences and the National Academy of Engineering, and a Fellow of the American Academy of Arts and Sciences.

Coordination Avoidance in Distributed Databases

Date and Time
Thursday, February 12, 2015 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Michael Freedman

Peter Bailis

The rise of Internet-scale geo-replicated services has led to considerable upheaval in the design of modern data management systems. Namely, given the availability, latency, and throughput penalties associated with classic mechanisms such as serializable transactions, a broad class of systems (e.g., "NoSQL") has sought weaker alternatives that reduce the use of expensive coordination during system operation, often at the cost of application integrity. When can we safely forgo the cost of this expensive coordination, and when must we pay the price?
 
In this talk, I will discuss the potential for coordination avoidance -- the use of as little coordination as possible while still ensuring application integrity -- in several modern data-intensive domains. Specifically, I will demonstrate how to leverage the semantic requirements of applications in data serving, transaction processing, and statistical analytics to enable more efficient distributed algorithms and system designs. The prototype systems I have built demonstrate order-of-magnitude speedups compared to their traditional, coordinated counterparts on a variety of tasks, including referential integrity and index maintenance, transaction execution under common isolation models, and asynchronous convex optimization. I will also discuss our experiences studying and optimizing a range of open source applications and systems, which exhibit similar results.
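
As a toy illustration of the distinction the talk draws (my example, not the speaker's formalism): some application invariants are preserved when replicas act independently and merge later, while others cannot be checked locally at all.

    # Hypothetical two-replica sketch. Increments commute, so a counter
    # merges correctly with no coordination; a global uniqueness
    # constraint, by contrast, cannot be verified by either replica alone.
    replica_a = {"counter": 0, "ids": []}
    replica_b = {"counter": 0, "ids": []}

    replica_a["counter"] += 5            # safe: increments commute
    replica_b["counter"] += 3
    replica_a["ids"].append("user-42")   # each replica sees "user-42"
    replica_b["ids"].append("user-42")   # as locally unique...

    merged_counter = replica_a["counter"] + replica_b["counter"]  # 8, correct
    merged_ids = replica_a["ids"] + replica_b["ids"]
    assert len(merged_ids) != len(set(merged_ids))  # ...but the merge
    # reveals a duplicate: enforcing uniqueness requires coordination.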
 
Peter Bailis is a Ph.D. candidate at UC Berkeley working in databases and distributed systems. As part of his dissertation work, he has studied and built high performance distributed data management systems for large scale transaction processing, data serving, and statistical analytics in the AMPLab and BOOM projects under the advisement of Joseph M. Hellerstein, Ali Ghodsi, and Ion Stoica. He is the recipient of the NSF Graduate Research Fellowship, the Berkeley Fellowship for Graduate Study, and best-of-conference citations for research appearing in both SIGMOD and VLDB. He received his A.B. in Computer Science from Harvard College in 2011, where he also received the CRA Outstanding Undergraduate Researcher Award.
 

Learning compact models from high dimensional large datasets

Date and Time
Monday, February 9, 2015 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Sebastian Seung

Yoram Singer

I review the design, analysis, and implementation of stochastic optimization techniques, online algorithms, and modeling approaches for learning in high-dimensional spaces using large amounts of data. The focus is on algorithms that are efficient and accurate and that yield compact models. Concretely, I will cover the forward-backward splitting algorithm (FOBOS), mirror descent for learning with composite objectives (COMID), and the adaptive gradient algorithm (AdaGrad). I will also discuss simple yet effective modeling approaches based on locality for learning from high-dimensional data.
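
As a concrete reference for one of the methods above, a minimal sketch of the AdaGrad update (an illustration of the published update rule, not code from the talk): each coordinate's effective learning rate shrinks with its accumulated squared gradients, so frequently updated coordinates take smaller steps.

    import numpy as np

    def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
        """One AdaGrad update: accumulate squared gradients per coordinate
        and divide the step by the square root of the accumulation."""
        accum += grad ** 2
        w -= lr * grad / (np.sqrt(accum) + eps)
        return w, accum

    # Example: one step on f(w) = ||w||^2 / 2, whose gradient is w itself.
    w, accum = np.array([1.0, -2.0]), np.zeros(2)
    w, accum = adagrad_step(w, grad=w.copy(), accum=accum)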

Yoram Singer is a senior research scientist at Google. From 1999 through 2007 he was an associate professor at the Hebrew University of Jerusalem, and from 1995 through 1999 he was a member of the technical staff at AT&T Research. He was the co-chair of the Conference on Computational Learning Theory (COLT) in 2004 and of Neural Information Processing Systems (NIPS) in 2007. He serves as an editor of the Journal of Machine Learning Research, IEEE Signal Processing Magazine, and IEEE Transactions on Pattern Analysis and Machine Intelligence.

Building an Operating System for the Data Center

Date and Time
Monday, February 23, 2015 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Jennifer Rexford

Simon Peter

Data centers run a range of important applications with ever-increasing performance demands, from cloud and server computing to Big Data and eScience. However, the scaling of CPU frequency has stalled in recent years, leading to hardware architectures that no longer transparently scale software performance. Two trends stand out: 1) Instead of frequency, hardware architectures increase the number of CPU cores, leaving complex memory-system performance and CPU scheduling tradeoffs exposed to software. 2) Network and storage I/O performance continues to improve, but without matching improvements in CPU frequency. Software thus faces ever-increasing I/O efficiency demands.

In my research, I address how operating systems (OSes) can handle this growing complexity and performance pressure without becoming the limiting factor for application performance. I first explain how current OS architectures are already inadequate to address these trends, limiting application performance. I then present how the OS can be redesigned to eliminate these performance limitations without compromising existing features or application compatibility. I finish with an outlook on how these hardware trends will affect software in the future and present ideas to address them.

Simon is a postdoctoral research associate at the University of Washington, where he leads research in operating systems and networks. His postdoctoral advisors are Tom Anderson and Arvind Krishnamurthy. Simon received a Ph.D. in Computer Science from ETH Zurich in 2012 and an MSc in Computer Science from the Carl-von-Ossietzky University Oldenburg, Germany in 2006.

Simon's research focus is on data center performance issues. For his work on the Arrakis high I/O performance operating system, he received the Jay Lepreau best paper award (2014) and the Madrona prize (2014). Previously, Simon has worked on the Barrelfish multicore operating system and conducted further award-winning systems research at various locations, including MSR Silicon Valley, MSR Cambridge, Intel Labs Germany, UC Riverside, and the NSF.

Randomness Vs. Memory: A Treasure(s) Hunt

Date and Time
Wednesday, February 4, 2015 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Sanjeev Arora

Omer Reingold

Randomness plays a major role in computation, allowing the execution of tasks that would otherwise be impossible or prohibitively inefficient. Yet the relative power of randomness compared with other computational resources is mostly unresolved. In this talk we will address the tradeoff between memory and randomness: can randomness significantly reduce the amount of memory needed for algorithmic tasks? This is one of the central problems in computational complexity, and its study is tied to the study of pseudorandom generators that fool memory-bounded algorithms (i.e., combinatorial objects that look random to small-memory distinguishers).

There are many exciting questions on the path towards resolving the randomness vs memory problem.  In this talk we will focus on two such areas of research. First, we will discuss small memory algorithms for graph connectivity as well as pseudorandom walks, which we will relate to the general problem of randomness vs. memory. Then we will describe pseudorandom families of hash functions for data structures that are simultaneously small and efficient (with applications to load balancing, the two-choice paradigm, cuckoo hashing, and min-hashing). We will relate such families to progress on more traditional pseudorandom objects such as epsilon-bias sequences and functions with limited independence. 
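
For context on "limited independence," a classic concrete example (standard textbook material, not taken from the talk) is the pairwise-independent family h(x) = ((a*x + b) mod p) mod m; a function from this family is described by just two numbers yet behaves like a random function on any fixed pair of keys.

    import random

    def make_pairwise_hash(m, p=2_147_483_647):  # p = 2^31 - 1, a prime
        """Draw one function from the pairwise-independent family
        h(x) = ((a*x + b) mod p) mod m, for integer keys in [0, p)."""
        a = random.randrange(1, p)
        b = random.randrange(0, p)
        return lambda x: ((a * x + b) % p) % m

    h = make_pairwise_hash(m=1024)   # e.g., hash keys into 1024 buckets
    bucket = h(123456)               # any two distinct keys collide with
                                     # probability roughly 1/m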

Omer Reingold is a Professor of Computer Science at the Weizmann Institute of Science and a Visiting Professor at Stanford University. Past positions include Microsoft Research, the Institute for Advanced Study in Princeton, NJ, and AT&T Labs. His research is in the Foundations of Computer Science and most notably in Computational Complexity and the Foundations of Cryptography with emphasis on randomness, derandomization and explicit combinatorial constructions. He is an ACM Fellow and among his distinctions are the 2005 Grace Murray Hopper Award and the 2009 Gödel Prize.

Data Centers, Energy, and Online Optimization

Date and Time
Thursday, January 15, 2015 - 11:00am to 12:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Jennifer Rexford

Adam Wierman

This talk will tell two stories, one about designing sustainable data centers and one about the underlying algorithmic challenges, which fall into the context of online convex optimization.

Story 1: The typical message surrounding data centers and energy is an extremely negative one: Data centers are energy hogs. This message is pervasive in both the popular press and academia, and it certainly rings true. However, the view of data centers as energy hogs is too simplistic. One goal of this talk is to highlight that, yes, data centers use a lot of energy, but data centers can also be a huge benefit in terms of integrating renewable energy into the grid and thus play a crucial role in improving the sustainability of our energy landscape. In particular, I will highlight a powerful alternative view: data centers as demand response opportunities.

Story 2: Typically in online convex optimization it is enough to exhibit an algorithm with low (sub-linear) regret, which implies that the algorithm can match the performance of the best static solution in retrospect.  However, what if one additionally wants to maintain performance that is nearly as good as the dynamic optimal, i.e., a good competitive ratio?  In this talk, I'll highlight that it is impossible for an online algorithm to achieve both goals simultaneously.  Luckily, in practical settings (like data centers), noisy predictions about the future are often available, and I will show that, under a general model of prediction noise, even very limited predictions about the future are enough to overcome the impossibility result.
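
To make the two benchmarks precise (the switching-cost term below reflects the standard smoothed online convex optimization setting; take the exact form as an assumption rather than the talk's definition):

    % Regret compares the algorithm's choices x_t to the best single
    % static point chosen in hindsight:
    \mathrm{Regret}_T = \sum_{t=1}^{T} f_t(x_t) - \min_{x} \sum_{t=1}^{T} f_t(x)

    % The competitive ratio compares total cost, including a movement
    % (switching) cost between successive points, to the dynamic optimum,
    % which may move every round:
    \mathrm{CR} = \sup_{f_1,\dots,f_T}
        \frac{\sum_{t=1}^{T} \bigl( f_t(x_t) + \lVert x_t - x_{t-1} \rVert \bigr)}
             {\min_{y_1,\dots,y_T} \sum_{t=1}^{T} \bigl( f_t(y_t) + \lVert y_t - y_{t-1} \rVert \bigr)}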

Adam Wierman is a Professor in the Department of Computing and Mathematical Sciences at the California Institute of Technology, where he is a founding member of the Rigorous Systems Research Group (RSRG) and maintains a popular blog called Rigor + Relevance. His research interests center on resource allocation and scheduling decisions in computer systems and services. He received the 2011 ACM SIGMETRICS Rising Star award and the 2014 IEEE Communications Society William R. Bennett Prize, and he has coauthored papers that received best paper awards at ACM SIGMETRICS, IEEE INFOCOM, IFIP Performance (twice), the IEEE Green Computing Conference, the IEEE Power & Energy Society General Meeting, and ACM GREENMETRICS.

Algorithms for Cancer Genomics

Date and Time
Monday, January 12, 2015 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

Ben Raphael

Understanding the molecular alterations that cause cancer is one of the greatest challenges in biology and medicine.   Recent advances in DNA sequencing technology now enable large-scale measurement of mutations in cancer genomes and provide an unprecedented opportunity to address this challenge.  However, translating this information into deeper insight about the mutational process that drives cancer demands novel computational approaches. 

In this talk, I will describe algorithms that we have developed to address key problems in cancer genomics.  First, I will discuss algorithms that deconvolve DNA sequencing data from a single tumor and determine subpopulations of cancer cells harboring different mutations.  These algorithms exploit integrality constraints on copy number aberrations and phylogenetic tree constraints that relate subpopulations.   Next, I will describe algorithms to identify combinations of mutations that perturb cellular signaling and regulatory networks.  One algorithm employs a heat diffusion process to identify subnetworks of a genome-scale interaction network that are recurrently mutated across samples.  A second algorithm finds combinations of mutations that optimize a measure of mutual exclusivity across samples.  I will illustrate applications of these approaches to multiple cancer types in The Cancer Genome Atlas (TCGA), including a recent Pan-Cancer study of >3000 samples from 12 cancer types.
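
As a sketch of the heat-diffusion idea (the matrix form and the value of beta are assumptions for illustration, not necessarily the exact formulation in the talk): heat placed on mutated genes spreads along interaction-network edges, and the heat exchanged between gene pairs defines candidate subnetworks.

    import numpy as np

    def diffusion_matrix(W, beta=0.45):
        """Insulated heat diffusion on a network whose column-normalized
        adjacency matrix is W: F = beta * (I - (1 - beta) * W)^-1.
        F[i, j] is the heat resting on gene i when gene j is the source."""
        n = W.shape[0]
        return beta * np.linalg.inv(np.eye(n) - (1 - beta) * W)

    def exchanged_heat(W, mutation_scores, beta=0.45):
        """Scale each column by its source gene's mutation score; strongly
        connected, heavily mutated genes form hot candidate subnetworks."""
        return diffusion_matrix(W, beta) @ np.diag(mutation_scores)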

Ben Raphael is an Associate Professor in the Department of Computer Science and Director of the Center for Computational Molecular Biology (CCMB) at Brown University.  His research focuses on the design of algorithms for genome sequencing and interpretation. Recent interests include structural variation in human and cancer genomes, and network/pathway analysis of genetic variants.  He received an S.B. in Mathematics from MIT, a Ph.D. in Mathematics from the University of California, San Diego (UCSD), and completed postdoctoral training in Bioinformatics and Computer Science at UCSD.  He is the recipient of an NSF CAREER award, a Career Award from the Burroughs Wellcome Fund, and a Sloan Research Fellowship.

Computational Sprinting

Date and Time
Monday, December 8, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Margaret Martonosi

Thomas Wenisch

Although transistor density continues to increase, power density now increases with each chip generation. Particularly in mobile devices, which have limited cooling options, these trends lead to "dark silicon," in which sustained chip performance is limited primarily by power rather than area. However, many mobile applications do not demand sustained performance; rather, they comprise short bursts of computation in response to sporadic user activity.

To improve responsiveness for such applications, this talk describes "computational sprinting," in which the system activates otherwise powered-down cores for brief bursts of intense parallel computation.  During a sprint, the chip temporarily exceeds its sustainable thermal power budget to provide instantaneous throughput, after which it must return to nominal operation to cool down.  To enable longer sprints, we describe a thermal design that incorporates phase-change materials to provide thermal capacitance.  Results indicate that sprinting can provide large improvements in responsiveness while actually saving energy due to race-to-idle effects.
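
A back-of-the-envelope sketch of the thermal-capacitance argument (the lumped model and all numbers are illustrative assumptions): the heat a sprint dissipates above the sustainable budget must fit within the thermal headroom of the chip plus whatever latent heat the phase-change material can absorb.

    def max_sprint_seconds(sprint_watts, sustainable_watts,
                           heat_capacity_j_per_k, headroom_k,
                           latent_heat_j=0.0):
        """Roughly how long the chip can sprint before exhausting its
        thermal headroom, using a single lumped thermal capacitance.
        Phase-change material contributes latent heat absorbed at a
        (nearly) constant temperature."""
        excess_watts = sprint_watts - sustainable_watts
        absorbable_joules = heat_capacity_j_per_k * headroom_k + latent_heat_j
        return absorbable_joules / excess_watts

    # Illustrative only: a 16 W sprint on a chip that can sustain 1 W,
    # with 1 J/K of capacitance and 10 K of headroom, lasts ~0.7 s;
    # adding 100 J of latent heat stretches it to ~7 s.
    print(max_sprint_seconds(16.0, 1.0, 1.0, 10.0))
    print(max_sprint_seconds(16.0, 1.0, 1.0, 10.0, latent_heat_j=100.0))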

Thomas Wenisch is an Associate Professor of Computer Science and Engineering at the University of Michigan, specializing in computer architecture. His prior research includes memory streaming for commercial server applications, multiprocessor memory systems, memory disaggregation, and rigorous sampling-based performance evaluation methodologies.  His ongoing work focuses on computational sprinting, server and data center architectures, programming models for byte-addressable NVRAM, and architectures to enable hand-held 3D ultrasound. Wenisch received the NSF CAREER award in 2009. Prior to his academic career, Wenisch was a software developer at American Power Conversion, where he worked on data center thermal topology estimation. He received his Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University.

Searching and Browsing Visual Data

Date and Time
Wednesday, December 3, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Tom Funkhouser

Kristen Grauman

Widespread visual sensors and unprecedented connectivity have left us awash with visual data---from online photo collections, home videos, news footage, medical images, or surveillance feeds.  How can we efficiently browse image and video collections based on semantically meaningful criteria?  How can we bring order to the data, beyond manually defined keyword tags?  I will present work exploring these questions in the context of interactive visual search and summarization. 

In particular, I’ll first introduce attribute representations that connect visual properties to human describable terms.  I’ll show how these attributes enable both fine-grained content-based retrieval as well as new forms of human supervision for recognition problems.  Then, I’ll overview our recent work on video summarization, where the goal is to automatically transform a long video into a short one.  Using videos captured with egocentric wearable cameras, we’ll see how hours of data can be distilled to a succinct visual storyboard that is understandable in just moments.  Together, these ideas are promising steps towards widening the channel of communication between humans and computer vision algorithms, which is critical to facilitate efficient browsing of large-scale image and video collections.

This is work done with Adriana Kovashka, Yong Jae Lee, Devi Parikh, Lu Zheng, Bo Xiong, and Dinesh Jayaraman.

Kristen Grauman is an Associate Professor in the Department of Computer Science at the University of Texas at Austin.  Her research in computer vision and machine learning focuses on visual search and object recognition.  Before joining UT-Austin in 2007, she received her Ph.D. in the EECS department at MIT, in the Computer Science and Artificial Intelligence Laboratory.  She is an Alfred P. Sloan Research Fellow and Microsoft Research New Faculty Fellow, a recipient of NSF CAREER and ONR Young Investigator awards, the Regents' Outstanding Teaching Award from the University of Texas System in 2012, the PAMI Young Researcher Award in 2013, the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence, and a Presidential Early Career Award for Scientists and Engineers (PECASE) in 2013.  She and her collaborators were recognized with the CVPR Best Student Paper Award in 2008 for their work on hashing algorithms for large-scale image retrieval, and the Marr Best Paper Prize at ICCV in 2011 for their work on modeling relative visual attributes.
