
CS Department Colloquium Series

Computational Social Science: Exciting Progress and Future Challenges

Date and Time
Tuesday, December 12, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series

Duncan Watts
The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers, leading some to herald the emergence of a new field: “computational social science.” In this talk I highlight two areas of research that would not have been possible just a handful of years ago: first, using “big data” to study social contagion on networks; and second, using virtual labs to extend the scale, duration, and complexity of traditional lab experiments. Although these examples were all motivated by substantive problems of longstanding interest to social science, they also illustrate how new classes of data can cast these problems in new light. At the same time, they illustrate some important limitations of our existing data-generating platforms. I conclude with some thoughts on how computational social science might overcome some of these obstacles to progress.

Bio: Duncan Watts is a principal researcher at Microsoft Research and a founding member of the MSR-NYC lab. He is also an A.D. White Professor-at-Large at Cornell University. Prior to joining MSR in 2012, he was a professor of Sociology at Columbia University from 2000 to 2007, and then a principal research scientist at Yahoo! Research, where he directed the Human Social Dynamics group. His research on social networks and collective dynamics has appeared in a wide range of journals, from Nature, Science, and Physical Review Letters to the American Journal of Sociology and Harvard Business Review, and has been recognized by the 2009 German Physical Society Young Scientist Award for Socio- and Econophysics, the 2013 Lagrange-CRT Foundation Prize for Complexity Science, and the 2014 Everett M. Rogers Award. He is also the author of three books: Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton University Press, 1999), Six Degrees: The Science of a Connected Age (W.W. Norton, 2003), and most recently Everything Is Obvious: Once You Know the Answer (Crown Business, 2011). Watts holds a B.Sc. in Physics from the Australian Defence Force Academy, from which he also received his officer’s commission in the Royal Australian Navy, and a Ph.D. in Theoretical and Applied Mechanics from Cornell University.

The Blessing and the Curse of the Multiplicative Updates: Connections between Evolution and the Multiplicative Updates of Online Learning

Date and Time
Monday, May 22, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Elad Hazan

Multiplicative updates multiply the parameters by nonnegative factors. These updates are motivated by a Maximum Entropy Principle, and they are prevalent in evolutionary processes, where the parameters are, for example, concentrations of species and the factors are survival rates. The simplest such update is Bayes' rule, and we give an in vitro selection algorithm for RNA strands that implements this rule in the test tube, where each RNA strand represents a different model. One liter of the RNA soup contains approximately 10^15 distinct strands, so this is a rather high-dimensional implementation of Bayes' rule.
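
For intuition, here is a minimal sketch (not from the talk, and with made-up numbers) of Bayes' rule as a multiplicative update: each hypothesis's weight is multiplied by a nonnegative factor, its likelihood, and the weights are renormalized.

```python
import numpy as np

# Bayes' rule as a multiplicative update: weights over hypotheses are
# multiplied by nonnegative factors (likelihoods) and renormalized.
# All numbers below are invented for illustration.
prior = np.array([0.5, 0.3, 0.2])        # weights over three models
likelihood = np.array([0.1, 0.7, 0.4])   # P(observation | model)

posterior = prior * likelihood           # multiply by nonnegative factors
posterior /= posterior.sum()             # renormalize
print(posterior)                         # mass shifts toward well-fitting models
```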

We investigate multiplicative updates for the purpose of learning online while processing a stream of examples. The “blessing” of these updates is that they learn very fast in the short term because the good parameters grow exponentially. However, their “curse” is that they learn too fast and wipe out parameters too quickly. This can have a negative effect in the long term. We describe a number of methods developed in the realm of online learning that ameliorate the curse of the multiplicative updates. The methods make the algorithm robust against data that changes over time and prevent the currently good parameters from taking over. We also discuss how the curse is circumvented by nature. Surprisingly, some of nature's methods parallel the ones developed in Machine Learning, but nature also has some additional tricks.
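
As one concrete illustration of the update and of a standard remedy for the curse, the sketch below implements a multiplicative-weights (Hedge-style) step with fixed-share mixing, a known online-learning technique for tracking data that changes over time; the talk's own methods may differ, and the learning rate, mixing rate, and losses here are made up.

```python
import numpy as np

def hedge_step(w, losses, eta=0.5, alpha=0.01):
    """One multiplicative-weights update with fixed-share style mixing.
    The pure update w *= exp(-eta * loss) lets good experts grow
    exponentially (the blessing) but drives the rest toward zero (the
    curse); mixing in a little uniform mass keeps them recoverable."""
    w = w * np.exp(-eta * losses)            # multiplicative update
    w = w / w.sum()
    return (1 - alpha) * w + alpha / len(w)  # fixed-share mixing

w = np.ones(4) / 4
for t in range(100):
    # expert 1 is best for the first half, expert 0 for the second
    losses = np.array([0.9, 0.1, 0.5, 0.5]) if t < 50 else np.array([0.1, 0.9, 0.5, 0.5])
    w = hedge_step(w, losses)
print(w)  # expert 0 recovers after the switch thanks to the mixing term
```

With alpha=0 the update is pure Hedge: expert 0's weight would be driven so close to zero in the first half that it could not recover quickly after the switch, which is exactly the long-term curse described above.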

This will be a high-level talk; no background in online learning is required. We will present a number of open problems and discuss how these updates are applied to training feed-forward neural nets.

Algorithm Design Using Polynomials

Date and Time
Tuesday, May 9, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Sanjeev Arora

In this talk I will present new methods for algorithm design and showcase results and breakthroughs obtained by looking through the lens of polynomials. I will start with the best known approximation algorithm for estimating the solution of the asymmetric traveling salesman problem, one of the oldest problems in computer science and optimization. Then, I will discuss a general framework based on computing inner products of polynomials that is used to design approximation algorithms for several important computational problems: volume maximization, counting matchings in bipartite graphs, Nash welfare maximization, and computing the permanent of positive semidefinite matrices.
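
For context on the last problem mentioned: the permanent of an n x n matrix can be computed exactly, though only in exponential time, via Ryser's inclusion-exclusion formula. The sketch below is a textbook implementation of that formula, not the approximation method from the talk.

```python
from itertools import combinations

def permanent(A):
    """Exact permanent via Ryser's formula, O(2^n * n^2) time:
    perm(A) = (-1)^n * sum over nonempty column subsets S of
              (-1)^{|S|} * prod_i sum_{j in S} A[i][j]."""
    n = len(A)
    total = 0.0
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            prod = 1.0
            for row in A:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** r * prod
    return (-1) ** n * total

# sanity check: perm([[a, b], [c, d]]) = a*d + b*c
assert permanent([[1.0, 2.0], [3.0, 4.0]]) == 1.0 * 4.0 + 2.0 * 3.0
```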

Nima Anari is a postdoctoral researcher at Stanford University working with Amin Saberi. His research interests are in the design and analysis of algorithms, with a recent focus on applications of polynomials in this endeavor. He recently completed his Ph.D. in computer science at UC Berkeley, where he was advised by Satish Rao and was part of the theory group. Prior to that, he received his B.Sc. in computer engineering and mathematics from Sharif University of Technology.

Fast statistical methods for big data genetics

Date and Time
Tuesday, April 25, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Mona Singh

As genomic data sizes have grown exponentially over the past decade, efficient statistical analysis has become a key challenge in quantitative genetics. I will describe three ongoing research thrusts in the fields of mixed model analysis, haplotype phasing and imputation, and somatic structural variant detection, each taking place at the intersection of mathematics, computer science, and genetics.

I am currently a Postdoctoral Research Associate in Statistical Genetics at the Harvard T.H. Chan School of Public Health working with Dr. Alkes Price. The broad aim of my research is to develop efficient computational algorithms to analyze very large genetic data sets. My current areas of focus include (1) fast algorithms for genotype phasing and imputation, (2) linear mixed model methods for heritability analysis, association testing, and risk prediction, and (3) computational phase-based detection of pre-cancerous mosaic chromosomal aberrations. Several of my research projects have resulted in software packages now in use by the genetics community.
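
As a rough, purely illustrative sketch of item (2), not the speaker's algorithms, the following toy estimates heritability h2 under a variance-components model y ~ N(0, h2*K + (1 - h2)*I) by grid search, using one eigendecomposition of the kinship matrix K so that each likelihood evaluation is cheap. The data are simulated.

```python
import numpy as np

# Toy heritability estimation under y ~ N(0, h2*K + (1 - h2)*I).
# Simulated data; illustrative only, not the speaker's methods.
rng = np.random.default_rng(0)
n, m = 500, 2000
G = rng.standard_normal((n, m))     # fake standardized genotypes
K = G @ G.T / m                     # kinship (genetic relationship) matrix

evals, evecs = np.linalg.eigh(K)    # one-time O(n^3) decomposition
h2_true = 0.6
# simulate y with covariance h2*K + (1 - h2)*I
y = evecs @ (np.sqrt(h2_true * evals + 1 - h2_true) * rng.standard_normal(n))
z = evecs.T @ y                     # rotate once; covariance becomes diagonal

def neg_loglik(h2):
    v = h2 * evals + (1 - h2)       # eigenvalues of h2*K + (1 - h2)*I
    return 0.5 * (np.sum(np.log(v)) + np.sum(z * z / v))

grid = np.linspace(0.01, 0.99, 99)
h2_hat = grid[np.argmin([neg_loglik(h) for h in grid])]
print(f"true h2 = {h2_true}, estimated h2 = {h2_hat:.2f}")
```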

Prior to joining the Harvard Chan School in 2013, I received my Ph.D. in Applied Mathematics from the Massachusetts Institute of Technology and my B.S. in Mathematics from the California Institute of Technology.

Smart redundancy for big-data systems: Theory and Practice

Date and Time
Thursday, April 13, 2017 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Kyle Jamieson and Prof. Naveen Verma, Departments of Computer Science and Electrical Engineering

Large-scale distributed storage and caching systems form the foundation of big-data systems. A key scalability challenge in distributed storage systems is achieving fault tolerance in a resource-efficient manner. Towards addressing this challenge, erasure codes provide a storage-efficient alternative to the traditional approach of data replication. However, classical erasure codes come with critical drawbacks: while optimal in utilizing storage space, they significantly increase the usage of other important cluster resources such as network and I/O. In the first part of the talk, I present new erasure codes and theoretical optimality guarantees. The proposed codes reduce network and I/O usage by 35-70% for typical parameters while retaining the storage efficiency of classical codes. I then present an erasure-coded storage system that employs the proposed codes, and demonstrate significant benefits over the state-of-the-art in evaluations under production settings at Facebook. Our codes have been integrated into Apache Hadoop 3.0. The second part of the talk focuses on achieving high performance in distributed caching systems. These systems routinely face skew in data popularity, background traffic imbalance, and server failures, all of which cause load imbalance across servers and degrade read latencies. I present EC-Cache, a cluster cache that employs erasure coding to achieve a 3-5x improvement on these metrics compared to the state-of-the-art.
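
To make the replication-versus-coding trade-off concrete, here is a toy single-parity erasure code, far simpler than the codes discussed in the talk: k data blocks plus one XOR parity block tolerate any single block loss at (k+1)/k storage overhead, compared with 3x for triple replication.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# encode: k = 3 data blocks -> 3 data + 1 parity (tolerates any 1 loss)
data = [b"block-one!", b"block-two!", b"blockthree"]  # equal-length blocks
parity = xor_blocks(data)

# decode: recover a lost block from the survivors plus the parity
lost = 1
survivors = [blk for i, blk in enumerate(data) if i != lost]
recovered = xor_blocks(survivors + [parity])
assert recovered == data[lost]
```

Production codes such as Reed-Solomon, and the codes proposed in the talk, tolerate multiple failures; the talk's contribution is additionally cutting the network and I/O cost paid during each repair.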

Rashmi K. Vinayak received her PhD in the EECS department at UC Berkeley in 2016, where she is now a postdoctoral researcher at AMPLab/RISELab and BLISS working with Ion Stoica and Kannan Ramchandran. Her dissertation received the Eli Jury Award 2016 from the EECS department at UC Berkeley for outstanding achievement in the area of systems, communications, control, or signal processing. Rashmi is the recipient of the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. She is also a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16. Her research interests lie in the theoretical and system challenges that arise in storage and analysis of big data.

Wearable and Wireless Cyber-Physical Systems for Non-invasive Sleep Quality Monitoring

Date and Time
Thursday, April 6, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Jennifer Rexford

Sleep occupies nearly a third of a human's life and serves a critical daily function, helping the body balance and regulate vital systems. However, in the United States, more than 35% of the population is sleep deprived. Quantifying sleep quality is therefore important and has significant clinical value for detecting and diagnosing various sleep-related disorders. Unfortunately, the current “gold standard” for studying patients' sleep is obtrusive, expensive, and often inaccurate.

In this talk, I will introduce our wearable and radio-based sensing systems that promise unobtrusive, low-cost, and accurate sleep studies in both in-hospital and in-home settings. I will start with WiSpiro, a sensing system that uses radio signals to unobtrusively monitor breathing volume and detect sleep-disordered breathing in patients from afar. I will then discuss LIBS, an in-ear wearable sensing system that can simultaneously monitor brain activity, eye movements, and facial muscle movements, all of which are critical for fine-grained sleep stage monitoring. I will also identify other potential uses of these systems in the broader context of health care, such as monitoring eating habits and disorders, detecting autism at an early stage, improving neurological surgery practice, and detecting seizures. I will conclude the talk by discussing my ongoing research as well as my future directions for improving current health care practice through the development of other innovative cyber-physical healthcare systems.

Tam Vu directs the Mobile and Networked Systems (MNS) Lab at the University of Colorado Boulder, where he and his team build systems to improve pediatric health care practice. At MNS, he designs and implements novel, practical cyber-physical systems that make physiological sensing (e.g., breathing volume measurement, brainwave monitoring, muscle movement recording, and sleep quality monitoring) less obtrusive and less costly.

Tam Vu’s research contributions have been recognized with four best paper awards (ACM SenSys 2016, MobiCom S3 2016, MobiCom 2012, and MobiCom 2011), a Google Faculty Research Award in 2014, and wide press coverage, including the Denver Post, CNN, The New York Times, The Wall Street Journal, National Public Radio (NPR), MIT Technology Review, and Yahoo News. He actively pushes his research outcomes into practice through technology transfer: he has filed 9 patents in the last 3 years and co-founded 2 startups to commercialize them. He received his Ph.D. in Computer Science from WINLAB, Rutgers University in 2013 and a B.S. in Computer Science from Hanoi University of Technology in 2006.

Efficient Methods and Hardware for Deep Learning

Date and Time
Monday, April 3, 2017 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Kai Li and Prof. David Wentzlaff, Computer Science and Electrical Engineering Departments

Deep learning has spawned a wide range of AI applications that are changing our lives. However, deep neural networks are both computationally and memory intensive, making them power-hungry when deployed on embedded systems and in data centers with a limited power budget. To address this problem, I will present an algorithm and hardware co-design methodology for improving the efficiency of deep learning.

I will first introduce "Deep Compression", which can compress deep neural network models by 10–49× without loss of prediction accuracy for a broad range of CNNs, RNNs, and LSTMs. The compression reduces both computation and storage. Next, by changing the hardware architecture to efficiently implement Deep Compression, I will introduce EIE, the Efficient Inference Engine, which can perform decompression and inference simultaneously, saving a significant amount of memory bandwidth. By taking advantage of the compressed model and handling its irregular computation pattern efficiently, EIE achieves a 13× speedup and 3000× better energy efficiency over a GPU. Finally, I will revisit the inefficiencies in current learning algorithms, present DSD training, and discuss the challenges and future work in efficient methods and hardware for deep learning.
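
As a minimal illustration of just the pruning ingredient of Deep Compression (the full pipeline also includes trained quantization, Huffman coding, and retraining to recover accuracy), the sketch below zeroes out the smallest-magnitude weights of a random matrix; the sizes and sparsity target here are arbitrary.

```python
import numpy as np

def prune_by_magnitude(w, sparsity=0.9):
    """Zero out the smallest-magnitude entries, keeping the largest (1 - sparsity)."""
    k = int(w.size * sparsity)
    threshold = np.partition(np.abs(w).ravel(), k)[k]  # k-th smallest magnitude
    mask = np.abs(w) >= threshold
    return w * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
w_pruned, mask = prune_by_magnitude(w, sparsity=0.9)
print(f"kept {mask.mean():.1%} of weights")  # ~10%; store sparsely, then retrain
```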

Song Han is a Ph.D. candidate supervised by Prof. Bill Dally at Stanford University. His research focuses on energy-efficient deep learning, at the intersection of machine learning and computer architecture. He proposed the Deep Compression algorithm, which can compress neural networks by 10–49× while fully preserving prediction accuracy, and he designed the first hardware accelerator that can perform inference directly on a compressed sparse model, resulting in significant speedups and energy savings. His work has been featured by O’Reilly, TechEmergence, TheNextPlatform, and Embedded Vision, and has had an impact on industry. He led research efforts in model compression and hardware acceleration that won the Best Paper Award at ICLR’16 and the Best Paper Award at FPGA’17. Before joining Stanford, Song graduated from Tsinghua University.

Natural Language Understanding with Paraphrases and Composition

Date and Time
Monday, April 3, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Elad Hazan

Natural language processing (NLP) aims to teach computers to understand human language. NLP has enabled some of the most visible applications of artificial intelligence, including Google search, IBM Watson, and Apple’s Siri. As AI is applied to increasingly complex domains such as health care, education, and government, NLP will play a crucial role in allowing computational systems to access the vast amount of human knowledge documented in the form of unstructured speech and text.

In this talk, I will discuss my work on training computers to make inferences about what is true or false based on information expressed in natural language. My approach combines machine learning with insights from formal linguistics in order to build data-driven models of semantics that are more precise and interpretable than would be possible using linguistically naive approaches. I will begin with my work on automatically adding semantic annotations to the 100 million phrase pairs in the Paraphrase Database (PPDB). These annotations provide the type of information necessary for carrying out precise inferences in natural language, transforming the database into the largest available lexical semantics resource for natural language processing. I will then turn to the problem of compositional entailment, and present an algorithm for performing inferences about long phrases that are unlikely to have been observed in data. Finally, I will discuss my current work on pragmatic reasoning: when and how humans derive meaning from a sentence beyond what is literally contained in the words. I will describe the difficulties that such "common-sense" inference poses for automatic language understanding, and present my ongoing work on models for overcoming these challenges.
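
To make the compositional-inference idea concrete, here is a toy natural-logic-style sketch with an invented two-entry lexicon (not the speaker's system): replacing a word with a hypernym preserves truth in upward-monotone contexts, while in downward-monotone contexts, such as under "no", only the reverse substitution is valid.

```python
# Illustrative only: the lexicon and function names are invented.
HYPERNYMS = {"dog": "animal", "sedan": "car"}  # word -> immediate hypernym

def lexically_entails(word, candidate):
    """True if following hypernym links from `word` reaches `candidate`."""
    while word in HYPERNYMS:
        word = HYPERNYMS[word]
        if word == candidate:
            return True
    return False

def substitution_is_valid(word, replacement, upward_monotone):
    """Upward-monotone: 'A dog barks' entails 'An animal barks'.
    Downward-monotone (under 'no'): 'No animal barks' entails 'No dog barks'."""
    if upward_monotone:
        return lexically_entails(word, replacement)
    return lexically_entails(replacement, word)

assert substitution_is_valid("dog", "animal", upward_monotone=True)
assert substitution_is_valid("animal", "dog", upward_monotone=False)
assert not substitution_is_valid("dog", "car", upward_monotone=True)
```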

Ellie Pavlick is a PhD student at the University of Pennsylvania, advised by Dr. Chris Callison-Burch. Her dissertation focuses on natural language inference and entailment. Outside of her dissertation research, Ellie has published work on stylistic variation in paraphrase (e.g., how paraphrases can affect the formality or complexity of language) and on applications of crowdsourcing to natural language processing and social science problems. She has been involved in the design and instruction of Penn's first undergraduate course on Crowdsourcing and Human Computation (NETS 213). Ellie is a 2016 Facebook PhD Fellow and has interned at Google Research, Yahoo Labs, and the Allen Institute for Artificial Intelligence.

Robots in Clutter: Learning to Understand Environmental Changes

Date and Time
Thursday, March 30, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Thomas Funkhouser

Robots today are confined to operating in relatively simple, controlled environments. One reason for this is that current methods for processing visual data tend to break down when faced with occlusions, viewpoint changes, poor lighting, and other challenging but common situations that occur when robots are placed in the real world. I will show that we can train robots to handle these variations by modeling the causes behind visual appearance changes. If robots can learn how the world changes over time, they can be robust to the types of changes that objects often undergo. I demonstrate this idea in the context of autonomous driving, and I will show how we can use it to improve performance at every step of the robotic perception pipeline: object segmentation, tracking, and velocity estimation. I will also present some recent work on learning to manipulate objects, using a similar framework of learning environmental changes. By learning how the environment can change over time, we can enable robots to operate in the complex, cluttered environments of our daily lives.

David Held is a postdoctoral researcher at UC Berkeley working with Pieter Abbeel on deep reinforcement learning for robotics. He recently completed his Ph.D. in Computer Science at Stanford University with Sebastian Thrun and Silvio Savarese, where he developed perception methods for autonomous vehicles. David has also worked as an intern on Google’s self-driving car team. Before Stanford, David was a researcher at the Weizmann Institute, where he worked on building a robotic octopus. He received a B.S. and M.S. in Mechanical Engineering from MIT and an M.S. in Computer Science from Stanford, for which he was awarded the Best Master's Thesis Award from the Computer Science Department.

Computational Design and Fabrication to Augment the Physical World

Date and Time
Thursday, March 9, 2017 - 12:30pm to 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Prof. Szymon Rusinkiewicz

Xiang 'Anthony' Chen
We are looking towards a future where a wide range of fabrication machines--from desktop 3D printers to industrial robotic arms--will allow end-users to create physical objects of their own designs. I develop computational design tools that enable end-users to express and convey their ideas into fabrication-ready 3D models. For example, Encore lets users print new parts directly onto, around, or through existing objects; Reprise helps users generate and customize adaptations to mechanically enhance existing objects; Façade creates Braille overlays from user-taken photos to make appliances accessible to visually impaired users; Forté turns a user's design into structures that can robustly support existing objects. In all these tools, the ultimate goal of fabrication is to augment the physical world, extending, adapting, annotating, or supporting existing objects to improve the quality of users' everyday lives.


Xiang 'Anthony' Chen is a PhD student working with Scott Hudson and Stelian Coros in the School of Computer Science at Carnegie Mellon University. His research develops technical and design approaches to building novel human-computer interfaces that enhance users' physical interactivity with ubiquitous computers or enable their creativity in fabricating physical objects of their own design (e.g., using 3D printing). Anthony is an Adobe Research Fellow in Human-Computer Interaction. Frequently collaborating with industrial research labs (Microsoft, Autodesk, and Google), he has published 13 papers in top-tier HCI conferences and journals (CHI, UIST, and TOCHI) and has received two best paper awards.
