CS Department Colloquium Series

Searching and Browsing Visual Data

Date and Time
Wednesday, December 3, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Tom Funkhouser

Kristen Grauman

Widespread visual sensors and unprecedented connectivity have left us awash with visual data---from online photo collections, home videos, news footage, medical images, or surveillance feeds.  How can we efficiently browse image and video collections based on semantically meaningful criteria?  How can we bring order to the data, beyond manually defined keyword tags?  I will present work exploring these questions in the context of interactive visual search and summarization. 

In particular, I’ll first introduce attribute representations that connect visual properties to human-describable terms.  I’ll show how these attributes enable both fine-grained content-based retrieval and new forms of human supervision for recognition problems.  Then, I’ll overview our recent work on video summarization, where the goal is to automatically transform a long video into a short one.  Using videos captured with egocentric wearable cameras, we’ll see how hours of data can be distilled to a succinct visual storyboard that is understandable in just moments.  Together, these ideas are promising steps towards widening the channel of communication between humans and computer vision algorithms, which is critical to facilitate efficient browsing of large-scale image and video collections.
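
As a rough, hypothetical illustration of how attribute representations can support retrieval with relative feedback (the attribute names, scores, and the refine helper below are invented for this sketch and are not Grauman's actual system):

    import numpy as np

    # Hypothetical attribute scores for a tiny image database: one row per image,
    # one column per attribute, as if predicted by pre-trained attribute models.
    attributes = ["formal", "shiny", "sporty"]
    scores = np.array([
        [0.9, 0.2, 0.1],   # image 0
        [0.4, 0.8, 0.3],   # image 1
        [0.1, 0.1, 0.9],   # image 2
        [0.7, 0.6, 0.2],   # image 3
    ])

    def refine(candidates, reference, attribute, direction):
        """Keep candidates that are 'more' (+1) or 'less' (-1) of an attribute
        than a reference image, i.e. relative-attribute-style feedback."""
        idx = attributes.index(attribute)
        mask = direction * (scores[candidates, idx] - scores[reference, idx]) > 0
        return candidates[mask]

    candidates = np.arange(len(scores))
    # "Show me images more formal than image 1, but less shiny than image 1."
    candidates = refine(candidates, reference=1, attribute="formal", direction=+1)
    candidates = refine(candidates, reference=1, attribute="shiny", direction=-1)
    print("matching images:", candidates)   # -> [0 3]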

This is work done with Adriana Kovashka, Yong Jae Lee, Devi Parikh, Lu Zheng, Bo Xiong, and Dinesh Jayaraman.

Kristen Grauman is an Associate Professor in the Department of Computer Science at the University of Texas at Austin.  Her research in computer vision and machine learning focuses on visual search and object recognition.  Before joining UT-Austin in 2007, she received her Ph.D. in the EECS department at MIT, in the Computer Science and Artificial Intelligence Laboratory.  She is an Alfred P. Sloan Research Fellow and Microsoft Research New Faculty Fellow, a recipient of NSF CAREER and ONR Young Investigator awards, the Regents' Outstanding Teaching Award from the University of Texas System in 2012, the PAMI Young Researcher Award in 2013, the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence, and a Presidential Early Career Award for Scientists and Engineers (PECASE) in 2013.  She and her collaborators were recognized with the CVPR Best Student Paper Award in 2008 for their work on hashing algorithms for large-scale image retrieval, and the Marr Best Paper Prize at ICCV in 2011 for their work on modeling relative visual attributes.

Some Algorithmic Challenges in Statistics

Date and Time
Monday, December 1, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series

Sham Kakade

Machine learning is seeing tremendous progress in its impact on society. Along with this progress comes an increasing role for both scalable algorithms and theoretical foundations, the hope being that such progress can facilitate further breakthroughs on core AI problems. This talk will survey recent progress and future challenges at the intersection of computer science and statistics, with a focus on three areas:
 
The first is learning the interactions between observed variables when there are latent (or hidden) causes that help to explain the correlations in the observed data. Settings where such latent variable models have seen success include document (or topic) modeling, hidden Markov models (say, for modeling time series of acoustic signals), and discovering communities of individuals in social networks.
 
The second is that of stochastic optimization. Many problems that arise in science and engineering are those in which we only have a stochastic approximation to the underlying problem at hand (e.g. linear regression or other problems where our objective function is a sample average). Such problems highlight some of the challenges we face at the interface of computer science and statistics: should we use a highly (numerically) accurate algorithm (with costly time and space requirements) or a crude stochastic approximation scheme like stochastic gradient descent (which is light on memory and simple to implement, yet has a poor convergence rate)?

Finally, I will provide a brief discussion of future challenges inspired by the impressive successes of deep learning.

 
A recurring theme is that algorithmic advances can provide new practical techniques for statistical estimation.
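
To make the stochastic-optimization trade-off above concrete, here is a small illustrative comparison, not taken from the talk, of an exact least-squares solve against a single pass of stochastic gradient descent on synthetic data; the dimensions, noise level, and step size are arbitrary choices for the sketch.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 100_000, 10
    X = rng.normal(size=(n, d))
    true_w = rng.normal(size=d)
    y = X @ true_w + 0.1 * rng.normal(size=n)

    # "Numerically accurate" route: solve the normal equations exactly.
    w_exact = np.linalg.solve(X.T @ X, X.T @ y)

    # "Crude" route: one pass of stochastic gradient descent, one example at a time.
    w_sgd = np.zeros(d)
    step = 0.01
    for i in range(n):
        grad = (X[i] @ w_sgd - y[i]) * X[i]   # gradient of 0.5 * (x_i^T w - y_i)^2
        w_sgd -= step * grad

    print("exact solve error:", np.linalg.norm(w_exact - true_w))
    print("sgd (one pass)  error:", np.linalg.norm(w_sgd - true_w))

The exact solve is more accurate but touches all the data at once; the single SGD pass is light on memory and often accurate enough given that the objective is itself only a sample average.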
 
Sham Kakade is a principal research scientist at Microsoft Research, New England. His research focus is on designing scalable and efficient algorithms for machine learning and artificial intelligence; he has worked (and has continuing interests) in areas such as statistics, optimization, probability theory, algorithms, economics, and neuroscience. Previously, Dr. Kakade was an associate professor in the Department of Statistics at the Wharton School, University of Pennsylvania (from 2010 to 2012) and an assistant professor at the Toyota Technological Institute at Chicago. Before this, he did a postdoc in the Computer and Information Science department at the University of Pennsylvania under the supervision of Michael Kearns. Dr. Kakade completed his PhD at the Gatsby Unit, where his advisor was Peter Dayan. Before Gatsby, he was an undergraduate at Caltech, where he received his BS in physics.

Interactive ML for People: The Small Data Problem

Date and Time
Wednesday, November 19, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

 

Emma Brunskill

Consider an intelligent tutoring system or an autonomous decision support tool for a doctor. Though such systems may in aggregate have a huge amount of data, the data collected for a single individual is typically very small, and the policy space (of what to next teach a student or how to help treat a patient) is enormous.

I will describe two machine learning efforts to tackle these small data challenges: learning across multiple tasks, and making better use of previously collected task data, where the tasks in both cases involve sequential stochastic decision processes (reinforcement learning and bandits). I will also present results showing how one of these techniques allowed us to substantially increase engagement in an educational game that teaches fractions.
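
As a toy illustration of reusing previously collected data in a bandit setting (not Brunskill's actual method), the sketch below estimates the value of a new policy from logs gathered under a uniform-random behavior policy via importance sampling; the arm count, reward rates, and policies are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)
    n_arms, n_logged = 3, 10_000
    true_means = np.array([0.2, 0.5, 0.7])          # hypothetical reward rates

    # Logged data collected under a uniform-random "behavior" policy.
    behavior_probs = np.full(n_arms, 1.0 / n_arms)
    actions = rng.integers(0, n_arms, size=n_logged)
    rewards = rng.binomial(1, true_means[actions])

    # Target policy we would like to evaluate without deploying it: always pull arm 2.
    target_probs = np.zeros(n_arms)
    target_probs[2] = 1.0

    # Importance-sampling estimate of the target policy's value from the old logs.
    weights = target_probs[actions] / behavior_probs[actions]
    is_estimate = np.mean(weights * rewards)
    print(f"IS estimate: {is_estimate:.3f}  (true value: {true_means[2]:.3f})")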

Emma Brunskill is an assistant professor in the computer science department at Carnegie Mellon University. She is also affiliated with the machine learning department at CMU. She works on reinforcement learning, focusing on applications that involve artificial agents interacting with people, such as intelligent tutoring systems. She is a Rhodes Scholar, Microsoft Faculty Fellow and NSF CAREER award recipient, and her work has received best paper nominations in Education Data Mining (2012, 2013) and CHI (2014).

Flexible, Reliable, and Scalable Nonparametric Learning

Date and Time
Monday, November 24, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

Erik Sudderth

Applications of statistical machine learning increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure.  Bayesian nonparametric models offer the promise of effective learning from big datasets, but standard inference algorithms often fail in subtle and hard-to-diagnose ways.  We explore this issue via variants of a popular and general model family, the hierarchical Dirichlet process.  We propose a framework for "memoized" online optimization of variational learning objectives, which achieves computational scalability by processing local batches of data, while simultaneously adapting the global model structure in a coherent fashion.  Using this approach, we build improved models of text, audio, image, and social network data.
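
To convey the flavor of the memoized, batch-wise updates in a much simpler setting than the talk's hierarchical Dirichlet process (plain incremental EM on a two-component Gaussian mixture rather than a variational objective), the sketch below caches each batch's sufficient statistics and swaps old statistics for new ones whenever a batch is revisited; all data and settings are synthetic.

    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic 1-D data from two well-separated clusters, split into 10 batches.
    data = np.concatenate([rng.normal(-2.0, 0.5, 600), rng.normal(3.0, 0.8, 400)])
    batches = np.array_split(rng.permutation(data), 10)

    K = 2
    means, variances, weights = np.array([-1.0, 1.0]), np.ones(K), np.full(K, 0.5)
    cache = [np.zeros((3, K)) for _ in batches]   # per-batch sufficient statistics
    totals = np.zeros((3, K))                     # their running sum over batches

    def local_step(x, means, variances, weights):
        """E-step on one batch: responsibilities summarized as sufficient stats."""
        log_p = (np.log(weights) - 0.5 * np.log(2 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        return np.stack([resp.sum(axis=0), resp.T @ x, resp.T @ (x ** 2)])

    for lap in range(5):                          # several passes over the batches
        for b, x in enumerate(batches):
            stats = local_step(x, means, variances, weights)
            totals += stats - cache[b]            # memoized swap: new stats replace old
            cache[b] = stats
            counts, sums, sqsums = totals         # global update from aggregated stats
            weights = counts / counts.sum()
            means = sums / counts
            variances = np.maximum(sqsums / counts - means ** 2, 1e-6)

    print("estimated means:", np.round(means, 2))  # close to the true -2 and 3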

Erik B. Sudderth is an Assistant Professor in the Brown University Department of Computer Science.  He received the Bachelor's degree (summa cum laude, 1999) in Electrical Engineering from the University of California, San Diego, and the Master's and Ph.D. degrees (2006) in EECS from the Massachusetts Institute of Technology.  His research interests include probabilistic graphical models; nonparametric Bayesian methods; and applications of statistical machine learning in computer vision and the sciences.  He received an NSF CAREER award in 2014, and in 2008 was named one of "AI's 10 to Watch" by IEEE Intelligent Systems Magazine.

Better Science Through Better Bayesian Computation

Date and Time
Thursday, November 13, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

Ryan Adams

As we grapple with the hype of "big data" in computer science, it is important to remember that the data are not the central objects: we collect data to answer questions and inform decisions in science, engineering, policy, and beyond.  In this talk, I will discuss my work in developing tools for large-scale data analysis, and the scientific collaborations in neuroscience, chemistry, and astronomy that motivate me and keep this work grounded.  I will focus on two lines of research that I believe capture an important dichotomy in my work and in modern probabilistic modeling more generally: identifying the "best" hypothesis versus incorporating hypothesis uncertainty.  In the first case, I will discuss my recent work in Bayesian optimization, which has become the state-of-the-art technique for automatically tuning machine learning algorithms, finding use across academia and industry. In the second case, I will discuss scalable Markov chain Monte Carlo and the new technique of Firefly Monte Carlo, which is the first provably correct MCMC algorithm that can take advantage of subsets of data.
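
As a minimal, self-contained sketch of the Bayesian-optimization idea, a Gaussian-process surrogate with an expected-improvement acquisition function minimizing a made-up 1-D objective (an illustration only, not the production systems built on this line of work):

    import numpy as np
    from scipy.stats import norm

    def rbf_kernel(a, b, length_scale=0.2):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length_scale) ** 2)

    def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
        """GP posterior mean and standard deviation at the query points."""
        k = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
        k_s = rbf_kernel(x_obs, x_query)
        chol = np.linalg.cholesky(k)
        alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
        mu = k_s.T @ alpha
        v = np.linalg.solve(chol, k_s)
        var = 1.0 - np.sum(v ** 2, axis=0)        # prior variance is 1 for this kernel
        return mu, np.sqrt(np.maximum(var, 1e-12))

    def expected_improvement(mu, sigma, best):
        """EI for minimization: expected amount by which we beat the best value."""
        z = (best - mu) / sigma
        return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    def objective(x):
        """Made-up, cheap stand-in for an expensive hyperparameter objective."""
        return np.sin(3 * x) + x ** 2 - 0.7 * x

    rng = np.random.default_rng(0)
    x_obs = rng.uniform(-1.0, 2.0, size=3)
    y_obs = objective(x_obs)
    grid = np.linspace(-1.0, 2.0, 500)

    for _ in range(15):                           # 15 sequential evaluations
        mu, sigma = gp_posterior(x_obs, y_obs, grid)
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y_obs.min()))]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))

    print("best x found:", x_obs[np.argmin(y_obs)], "value:", y_obs.min())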

Ryan Adams is an Assistant Professor of Computer Science at Harvard University, in the School of Engineering and Applied Sciences. He leads the Harvard Intelligent Probabilistic Systems group, whose research focuses on machine learning and computational statistics, with applied collaborations across the sciences.  Ryan received his undergraduate training in EECS at MIT and completed his Ph.D. in Physics at Cambridge University as a Gates Cambridge Scholar under David MacKay.  He was a CIFAR Junior Research Fellow at the University of Toronto before joining the faculty at Harvard.  His Ph.D. thesis received Honorable Mention for the Leonard J. Savage Award for Bayesian Theory and Methods from the International Society for Bayesian Analysis.  Ryan has won paper awards at ICML, AISTATS, and UAI, and received the DARPA Young Faculty Award.
 

Decentralized Anonymous Credentials and Electronic Payments from Bitcoin

Date and Time
Wednesday, November 5, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ed Felten

Traditionally, making statements about identity on the Internet, whether literal assertions of identity or statements about one’s attributes, requires centralized providers who issue credentials attesting to the user’s information. These organizations, which include Certificate Authorities, DNS maintainers, or login providers like Google and Facebook, play a large role in securing Internet infrastructure, email, and financial transactions. Our increasing reliance on these providers raises concerns about privacy and trust.

Anonymous credentials represent a powerful solution to this privacy concern: they deprive even colluding credential issuers and verifiers of the ability to identify and track their users. Although credentials may involve direct assertions of identity, they may also be used for a large range of useful assertions, such as “my TPM says my computer is secure”, “I have a valid subscription for content”, or “I am eligible to vote.” Anonymous credentials can also be used as a basis for constructing untraceable electronic payment systems, or “e-cash".

Unfortunately most existing anonymous credential and e-cash systems have a fundamental limitation: they require the appointment of a central, trusted party to issue credentials or tokens. This issuer represents a single point of failure and an obvious target for compromise. In distributed settings such as ad hoc or peer-to-peer networks, it may be challenging even to identify parties who can be trusted to play this critical role.

In this talk I will discuss new techniques for building anonymous credentials and electronic cash in a fully decentralized setting. The basic ingredient of these proposals is a "distributed public append-only ledger", a technology which has most famously been deployed in digital currencies such as Bitcoin. This ledger can be employed by individual nodes to make assertions about a user’s attributes in a fully anonymous fashion — without the assistance of a credential issuer. One concrete result of these techniques is a new protocol named “Zerocash”, which adds cryptographically unlinkable electronic payments to the Bitcoin currency.
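
The sketch below is only a toy, centralized stand-in for the append-only ledger idea: a hash-chained log that stores salted-hash commitments to user attributes. It omits everything that makes the real constructions work, in particular distribution, consensus, and the zero-knowledge proofs that let commitments be used without linking, so every name and detail here is hypothetical.

    import hashlib
    import json
    import secrets
    import time

    def commit(attribute: str, nonce: str) -> str:
        """Salted-hash commitment: hides the attribute, binds the user to it."""
        return hashlib.sha256(f"{nonce}|{attribute}".encode()).hexdigest()

    class ToyLedger:
        """Append-only log: each entry includes the hash of the previous entry."""
        def __init__(self):
            self.entries = []

        def append(self, payload: dict) -> str:
            prev = self.entries[-1]["hash"] if self.entries else "0" * 64
            body = {"prev": prev, "time": time.time(), "payload": payload}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            self.entries.append({**body, "hash": digest})
            return digest

        def verify(self) -> bool:
            prev = "0" * 64
            for e in self.entries:
                body = {k: e[k] for k in ("prev", "time", "payload")}
                digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
                if e["prev"] != prev or e["hash"] != digest:
                    return False
                prev = e["hash"]
            return True

    ledger = ToyLedger()
    nonce = secrets.token_hex(16)                       # kept secret by the user
    ledger.append({"commitment": commit("over-18", nonce)})
    print("ledger intact:", ledger.verify())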

Prof. Matthew Green is a Research Professor at the Johns Hopkins University Information Security Institute. His research focus is on cryptographic techniques for maintaining users’ privacy, and on technologies that enable the deployment of privacy-preserving protocols. From 2004-2011, Green served as CTO of Independent Security Evaluators, a custom security evaluation firm with a global client base. Along with a team at Johns Hopkins and RSA Laboratories, he discovered flaws in the Texas Instruments Digital Signature Transponder, a cryptographically-enabled RFID device used in the Exxon Speedpass payment system and in millions of vehicle immobilizers.

Machine Learning for Robots: Perception, Planning and Motor Control

Date and Time
Monday, November 3, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Sebastian Seung

Daniel Lee

Machines today excel at seemingly complex games such as chess and Jeopardy, yet still struggle with basic perceptual, planning, and motor tasks in the physical world.  What are the appropriate representations needed to execute and adapt robust behaviors in real-time?  I will present some examples of learning algorithms from my group that have been applied to robots for monocular visual odometry, high-dimensional trajectory planning, and legged locomotion. These algorithms employ a variety of techniques central to machine learning: dimensionality reduction, online learning, and reinforcement learning.  I will show and discuss applications of these algorithms to autonomous vehicles and humanoid robots.

Video: Daniel Lee on Comedy Central's 'The Colbert Report', October 28, 2010

Daniel Lee is the Evan C Thompson Term Chair, Raymond S. Markowitz Faculty Fellow, and Professor in the School of Engineering and Applied Science at the University of Pennsylvania. He received his B.A. summa cum laude in Physics from Harvard University in 1990 and his Ph.D. in Condensed Matter Physics from the Massachusetts Institute of Technology in 1995.  Before coming to Penn, he was a researcher at AT&T and Lucent Bell Laboratories in the Theoretical Physics and Biological Computation departments.  He is a Fellow of the IEEE and has received the National Science Foundation CAREER award and the University of Pennsylvania Lindback award for distinguished teaching. He was also a fellow of the Hebrew University Institute of Advanced Studies in Jerusalem, an affiliate of the Korea Advanced Institute of Science and Technology, and organized the US-Japan National Academy of Engineering Frontiers of Engineering symposium.  As director of the GRASP Robotics Laboratory and co-director of the CMU-Penn University Transportation Center, his group focuses on understanding general computational principles in biological systems, and on applying that knowledge to build autonomous systems.

Statistical and machine learning challenges in the analysis of large networks

Date and Time
Tuesday, December 2, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Barbara Engelhardt

Edoardo M. Airoldi

Network data --- i.e., collections of measurements on pairs, or tuples, of units in a population of interest --- are ubiquitous nowadays in a wide range of machine learning applications, from molecular biology to marketing on social media platforms. Surprisingly, assumptions underlying popular statistical methods are often untenable in the presence of network data. Established machine learning algorithms often break when dealing with combinatorial structure. And the classical notions of variability, sample size, and ignorability take on unintended connotations. These failures open the door to a number of technical challenges, and to opportunities for introducing new fundamental ideas and developing new insights. In this talk, I will discuss open statistical and machine learning problems that arise when dealing with large networks, mostly focusing on modeling and inferential issues, and provide an overview of key technical ideas, recent results, and trends.
 
Edoardo M. Airoldi is an Associate Professor of Statistics at Harvard University, where he leads the Harvard Laboratory for Applied Statistical Methodology. He holds a Ph.D. in Computer Science and an M.Sc. in Statistics from Carnegie Mellon University, and a B.Sc. in Mathematical Statistics and Economics from Bocconi University. His current research focuses on statistical theory and methods for designing and analyzing experiments in the presence of network interference, and on inferential issues that arise in models of network data. He works on applications in molecular biology and proteomics, and in social media analytics and marketing. Airoldi is the recipient of several research awards, including the ONR Young Investigator Award, the NSF CAREER Award, and the Alfred P. Sloan Research Fellowship, and has received several outstanding paper awards, including the Thomas R. Ten Have Award for his work on causal inference and the John Van Ryzin Award for his work in biology. He has recently advised the Obama for America 2012 campaign on their social media efforts, and serves as a technical advisor at Nanigans and Maxpoint.
 

The Network Inside Out: New Vantage Points for Internet Security

Date and Time
Wednesday, October 15, 2014 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Ed Felten

J. Alex Halderman

The Internet's size, and the diversity of connected hosts, create difficult challenges for security.  Conventionally, most vulnerabilities are discovered through labor-intensive scrutiny of individual implementations, but this scales poorly, and important classes of vulnerabilities can be hard to detect when considering hosts in isolation. Moreover, the security of the Internet as a whole is affected by management decisions made by individual system operators, but it is difficult to make sense of these choices--or to influence them to improve security--without a global perspective.

In recent work, I have been developing new approaches to these challenges, based on the analysis of large-scale Internet measurement data. By collecting and analyzing the public keys used for HTTPS and SSH, my team discovered serious weaknesses in key generation affecting millions of machines, and we were able to efficiently factor the RSA moduli used by almost 0.5% of all HTTPS servers. By clustering and investigating the vulnerable hosts, we exposed flawed cryptographic implementations in network devices manufactured by more than 60 companies and uncovered a critical design flaw in the Linux kernel. 
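
The weakness exploited in that work comes down to distinct hosts accidentally sharing a prime factor, so that a nontrivial GCD between two public moduli factors both of them. The toy sketch below, with tiny primes and made-up host names, shows the idea; at Internet scale an efficient batch-GCD computation replaces this naive pairwise scan.

    import math
    from itertools import combinations

    # Toy RSA moduli built from tiny primes (illustration only). host-a and host-b
    # accidentally share the prime p, as happens with bad randomness at key time.
    p, q1, q2, q3, q4 = 10007, 10009, 10037, 10039, 10061
    moduli = {"host-a": p * q1, "host-b": p * q2, "host-c": q3 * q4}

    # A nontrivial GCD between two moduli immediately factors both of them.
    for (h1, n1), (h2, n2) in combinations(moduli.items(), 2):
        g = math.gcd(n1, n2)
        if g > 1:
            print(f"{h1} and {h2} share the factor {g}:")
            print(f"  {h1} = {g} * {n1 // g},  {h2} = {g} * {n2 // g}")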

To help other researchers apply similar techniques, we developed ZMap, a tool for performing Internet-wide network surveys that can probe the entire IPv4 address space in minutes, thousands of times faster than prior approaches. ZMap has become a thriving open-source project and is available in major Linux distributions. We've used it to develop defenses against compromised HTTPS certificate authorities, to study the Internet's response to the infamous OpenSSL Heartbleed vulnerability, and to significantly increase the global rate of patching for vulnerable hosts. Ultimately, measurement-driven approaches to Internet security may help shift the security balance of power to favor defenders over attackers.

J. Alex Halderman is an assistant professor of computer science and engineering at the University of Michigan and director of Michigan's Center for Computer Security and Society. His research focuses on computer security and privacy, with an emphasis on problems that broadly impact society and public policy. Prof. Halderman's interests include application security, network security, anonymous and censorship-resistant communication, electronic voting, digital rights management, mass surveillance, and online crime, as well as the interaction of technology with law, regulatory policy, and international affairs.

Prof. Halderman is widely known for developing the "cold boot" attack against disk encryption, which altered widespread security assumptions about the behavior of RAM, influenced computer forensics practice, and inspired the creation of a new subfield of theoretical cryptography. A noted expert on electronic voting security, he helped lead the first independent review of the election technology used by half a billion voters in India, which prompted the national government to undertake major technical reforms. He has authored more than 50 publications, and his work has won numerous distinctions, including two best paper awards from Usenix Security, a top systems security venue.

