CS Department Colloquium Series | Page 12 | Computer Science Department at Princeton University

Visual Perception and Navigation in 3D Scenes

Date and Time

Thursday, April 12, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Saurabh Gupta, from University of California at Berkeley

Host

Szymon Rusinkiewicz

In recent times, computer vision has made great leaps towards 2D understanding of sparse visual snapshots of the world. This is insufficient for robots that need to exist and act in the 3D world around them based on a continuous stream of multi-modal inputs. In this talk, I will present some of my efforts in bridging this gap between computer vision and robotics. I will show how thinking about computer vision and robotics together brings out limitations of current computer vision tasks and techniques, and motivates joint study of perception and action for robotic tasks. I will showcase these aspects via three examples: 3D scene understanding, representation learning for varied modalities, and visual navigation. I will conclude by pointing out future research directions at the intersection of computer vision and robotics, thus showing how the two fields are ready to get back together.

Bio: Saurabh Gupta is a Ph.D. student at UC Berkeley, where he is advised by Jitendra Malik. His research interests include computer vision, robotics and machine learning. His Ph.D. work focuses on 3D scene understanding and visual navigation. His work is supported by a Berkeley Fellowship and a Google Fellowship in Computer Vision.

Compositional Visual Intelligence

Date and Time

Thursday, March 29, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Justin Johnson, from Stanford University

Host

Olga Russakovsky

The field of computer vision has made enormous progress in the last few years, largely due to convolutional neural networks. Despite success on traditional computer vision tasks, our systems are still a long way from the general visual intelligence of people. I will argue that an important facet of visual intelligence is composition - understanding of the whole derives from an understanding of the parts. To achieve the goal of compositional visual intelligence, we must explore new computer vision tasks, create new datasets, and develop new models that exploit compositionality. I will discuss the Visual Genome dataset which we created in service of these goals, and three research directions enabled by this new data where incorporating compositionality results in systems with richer visual intelligence.

I will first discuss image captioning: traditional systems generate short sentences describing images, but by decomposing images into regions and descriptions into phrases we can generate two types of richer descriptions: dense captions and paragraphs. Second, I will discuss visual question answering: existing datasets (including Visual Genome) consist primarily of short, simple questions; to study more complex questions requiring compositional reasoning, we built the CLEVR dataset and show that existing methods fall short on this new benchmark. We then propose an explicitly compositional model for visual question answering that internally converts questions to functional programs, and executes these programs by composing neural modules. Third, I will discuss text-to-image synthesis: existing systems can generate simple images of a single object conditioned on text descriptions, but struggle with more complex descriptions. By replacing freeform natural language with compositional scene graphs of objects and relationships, we can generate complex images containing multiple objects. I will conclude by discussing future areas where compositionality can be used to enrich visual intelligence.

Bio:
Justin is a PhD candidate in Computer Science at Stanford University, advised by Fei-Fei Li. His research interests lie at the intersection of computer vision and machine learning. Since 2015 he has co-taught a Stanford course on convolutional neural networks and deep learning with Andrej Karpathy, Serena Yeung, and Fei-Fei Li, which has been viewed hundreds of thousands of times online. He received his BS in Mathematics and Computer Science at the California Institute of Technology, and during his PhD he has spent time at Google Cloud AI, Facebook AI Research, and Yahoo Research.

Knowledge from Language via Deep Understanding

Date and Time

Monday, March 26, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Danqi Chen, from Stanford University

Host

Barbara Engelhardt

Almost all of human knowledge is now available online, but the vast majority of it is principally encoded in the form of human language explanations. In this talk, I explore novel neural network approaches that open up opportunities for getting a deep understanding of natural language text. First, I show how distributed representations enabled the building of a smaller, faster and more accurate dependency parser for finding the structure of sentences. Then I show how related neural technologies can be used to improve the construction of knowledge bases from text. However, maybe we don't need this intermediate step and can directly gain knowledge and answer people's questions from large textbases? In the third part, I explore this possibility by directly reading text with a simple yet highly effective neural architecture for question answering.

Bio:
Danqi Chen is a PhD student in Computer Science at Stanford University, working with Christopher Manning on deep learning approaches to natural language processing. Her research centers on how computers can achieve a deep understanding of human language and the information it contains. Danqi received Outstanding Paper Awards at ACL 2016 and EMNLP 2017, a Facebook Fellowship, a Microsoft Research Women’s Fellowship and an Outstanding Course Assistant Award from Stanford. Previously, she received her B.E. in Computer Science from Tsinghua University.

A Boundless, Fluid and Programmable Reality

Date and Time

Tuesday, February 27, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Misha Sra, from Massachusetts Institute of Technology

Host

Prof. Szymon Rusinkiewicz

Virtual reality, in the form we recognize it today, dates all the way back to 1968 when Ivan Sutherland created the first head-mounted display. Half a century later, we’ve only just begun to explore what VR might ultimately become. Imagine standing at the edge of a virtual cliff -- your heart racing and knees shaking. Cognitively, you know the edge is not there, and yet your unconscious responses say otherwise. This paradox is the root of the concept of presence. Extend this to walking, using virtual objects, and interacting with anyone, anywhere, and the power of VR starts to emerge. VR has already shown great potential across an extensive array of application domains like therapy, education, journalism, architecture, data visualization, remote collaboration, and entertainment.

My doctoral research explores novel ways of enhancing presence by incorporating spatial and sensory affordances from the physical world for single and multiuser applications. The challenge of integrating the real world with the virtual one while simultaneously trying to prevent the couch or the wall from disrupting presence can be surprisingly complex. But doing so successfully can provide a magical experience where the real and virtual worlds blend seamlessly to create something new.

In this talk, I will show different methods of integrating key elements from the user’s physical environment into the virtual experience that lead to an enhanced sense of presence. The goal is to invent a new language of “interactive experience design” based on managing perception and attention, a language as rich and varied as that of cinema. The methods aim to maintain an unbroken sense of presence despite the constraints of the physical world and the disruptions imposed by interacting with the system. I will conclude with a few research challenges and opportunities in virtual and mixed reality that I am excited about and hope to work on in the future.

Bio:
Misha Sra is a PhD student working with Prof. Pattie Maes in the Fluid Interfaces Group at the MIT Media Lab. Her research involves development of novel virtual and mixed reality experiences that integrate elements from the user's real world environment with the goal of enhancing the user's sense of presence. She is interested in applications of virtual and mixed reality in learning, healthcare, collaboration, and entertainment and is more broadly interested in data-driven design of immersive environments. Misha has published at the most selective HCI and VR venues such as CHI, UIST, IEEE VR, and VRST where she received a best paper award. From 2014-2015, she was a Robert Wood Johnson Foundation wellbeing research fellow at the MIT Media Lab. In spring 2016, she received the Silver Award in the annual Edison Awards global competition that honors excellence in human-centered design and innovation.

Interactive Systems for Code and Data Demography

Date and Time

Monday, February 26, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Elena Glassman, from University of California, Berkeley

Host

Prof. Adam Finkelstein

Programming—the means by which we tell computers what to do—has changed a lot over time. Programming today means programming alongside hundreds of fellow students, thousands of fellow professional software engineers at a particular company, or millions of fellow developers in the open-source community sharing their code online. In this talk, I will describe several interactive systems I have built that exploit the structure within large volumes of peer-produced code to help individual programmers learn how to write more correct, readable code.

These systems are made possible by code demography, which I define as statistics, algorithms, and visualizations that help people comprehend and interact with population-level structure and trends in large code corpora. The key to my approach is designing or inferring abstractions that capture critical features and abstract away variation that is irrelevant to the user. Code demography can reveal strategically diverse sets of aligned code examples which, according to theories of human concept learning, help people learn, i.e., construct mental abstractions that generalize well.

I will focus this talk on two families of systems that use program analysis, program synthesis, and visualization to either power active data-driven teaching in large programming classrooms or passive knowledge sharing within developer communities. Some of these systems have been integrated into UC Berkeley’s largest introductory programming class, which regularly enrolls over 1500 students. I will conclude with my vision for how the techniques of code demography can be generalized to more types of messy, structured, complex data corpora in order to help data scientists and enable new data-driven programming paradigms.

Bio:
Elena Glassman is an EECS postdoctoral researcher at UC Berkeley, in the Berkeley Institute of Design, funded by the NSF ExCAPE Expeditions in Computer Augmented Program Engineering grant and the Moore/Sloan Data Science Fellowship from the UC Berkeley Institute for Data Science (BIDS). She earned her PhD in EECS at the MIT CS & AI Lab in August 2016, where she created scalable systems that analyze, visualize, and provide insight into the code of thousands of programming students. She has been a summer research intern at both Google and Microsoft Research, working on systems that help people teach and learn. She recently joined the program committees of ACM CHI, ACM Learning at Scale, and two SPLASH workshops on programming usability. She was awarded the 2003 Intel Foundation Young Scientist Award, both the NSF and NDSEG graduate fellowships, the MIT EECS Oral Master’s Thesis Presentation Award, a Best of CHI Honorable Mention, and the MIT Amar Bose Teaching Fellowship for innovation in teaching methods.

Prior to entering the field of human-computer interaction (HCI), she earned her MEng in the MIT CSAIL Robot Locomotion Group and was a visiting researcher at Stanford in the Stanford Biomimetics and Dextrous Manipulation Lab.

Parting the Cloud: Empowering End-Users through Internet Transparency

Date and Time

Tuesday, February 20, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Marshini Chetty, from Princeton University

Host

Prof. David Dobkin

Why is my Internet slow? Why have I run out of data again? Is my thermostat talking to strangers? Internet users face seemingly simple questions like these every time they connect to the Internet, with increasingly significant consequences for inaccurate information. Yet, as networks, applications, and devices proliferate and become more complex, questions like these are getting harder to answer. I study, design, build, and evaluate technologies to help users answer these questions, by improving the “transparency” of the networks, applications, and networked devices used to get online. My research goal is to empower Internet users by providing them with accurate and real-time information about, and control over, Internet performance, costs, privacy, and security. By doing so, my work gives Internet users the power to make well-informed decisions and agency for protecting themselves against and holding others accountable for online malfeasance. My research informs Internet policy by providing evidence of what kinds of transparency Internet users require and respond to. In this talk, I describe why Internet transparency is so important and how challenging it is to provide transparency to end-users. I present three projects that make Internet costs, online privacy and security, and Internet constraints more transparent for end-users. These projects include a system for giving adult users information and control over home broadband data usage, a prototype for helping elementary school age children learn about and manage their own Internet safety, and a mixed method approach for understanding how Internet constraints affect end-users in developing countries. I conclude the talk with open questions for making the Internet more transparent for end-users.

Bio:
Marshini Chetty is a research scholar in the Department of Computer Science at Princeton University where she directs the Princeton Human-Computer Interaction Laboratory. She specializes in human-computer interaction, usable security, and ubiquitous computing. Marshini designs, implements, and evaluates technologies to help users manage different aspects of Internet use, from performance and costs to privacy and security. She often works in resource-constrained settings and uses her work to help inform Internet policy. She has a Ph.D. in Human-Centered Computing from Georgia Institute of Technology, USA and a Masters and Bachelors in Computer Science from University of Cape Town, South Africa. Prior to joining Princeton, Marshini was an assistant professor in the College of Information Studies at the University of Maryland, College Park. Her work on creating consumer-facing tools for monitoring Internet speeds won a CHI best paper award; her research has been funded by the National Science Foundation, the National Security Agency, Intel, Microsoft, and two Google Faculty Research Awards.

Interactive Systems based on Electrical Muscle Stimulation

Date and Time

Monday, February 12, 2018 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Pedro Lopes, from Hasso Plattner Institute

Host

Prof. Adam Finkelstein

How can interactive devices connect with users in the most immediate and intimate way? This question has driven interactive computing for decades. If we think back to the early days of computing, user and device were quite distant, often located in separate rooms. Then, in the ’70s, personal computers “moved in” with users. In the ’90s, mobile devices moved computing into users’ pockets. More recently, wearables brought computing into constant physical contact with the user’s skin. These transitions proved to be useful: moving closer to users and spending more time with them allowed devices to perceive more of the user, allowing devices to act more personal. The main question that drives my research is: what is the next logical step? How can computing devices become even more personal?

Some researchers argue that the next generation of interactive devices will move past the user’s skin, and be directly implanted inside the user’s body. This has already happened in that we have pacemakers, insulin pumps, etc. However, I argue that what we see is not devices moving towards the inside of the user’s body but towards the “interface” of the user’s body they need to address in order to perform their function.

This idea holds the key to more immediate and personal communication between device and user. The question is how to increase this immediacy. My approach is to create devices that intentionally borrow parts of the user’s body for input and output, rather than adding more technology to the body. I call this concept “devices that overlap with the user’s body”. I’ll demonstrate my work in which I explored one specific flavor of such devices, i.e., devices that borrow the user’s muscles.

In my research I create computing devices that interact with the user by reading and controlling muscle activity. My devices are based on medical-grade signal generators and electrodes attached to the user’s skin that send electrical impulses to the user’s muscles; these impulses then cause the user’s muscles to contract. While electrical muscle stimulation (EMS) devices have been used to regenerate lost motor functions in rehabilitation medicine since the ’60s, during my PhD I explored EMS as a means for creating interactive systems. My devices form two main categories: (1) Devices that allow users eyes-free access to information by means of their proprioceptive sense, such as a variable, a tool, or a plot. (2) Devices that increase immersion in virtual reality by simulating large forces, such as wind, physical impact, or walls and heavy objects.

Bio:
Pedro Lopes is a PhD Candidate at Prof. Baudisch’s Human Computer Interaction Lab at the Hasso Plattner Institute, Germany. Pedro’s work asks the question: what if interfaces would share part of our body? Pedro has materialized these ideas by creating interactive systems based on electrical muscle stimulation. These devices use part of the wearer’s body for output, i.e., the computer can output by actuating the user’s muscles with electrical impulses, causing the body to move involuntarily. The wearer can sense the computer’s activity on their own body by means of their sense of proprioception. Pedro’s wearable systems have been shown to (1) increase realism in VR, (2) provide a novel way to access information through proprioception, and (3) serve as a platform to experience and question the boundaries of our sense of agency.

Pedro’s work is published at top-tier conferences (ACM CHI & UIST) and demonstrated at venues such as ACM SIGGRAPH and IEEE Haptics. Pedro has received the ACM CHI Best Paper award for his work on Affordance++, Best Talk Awards and several nominations. As part of his research, Pedro has exhibited at Ars Electronica 2017, Science Gallery Dublin and World Economic Forum in San Francisco. His work also captured the interest of media, such as MIT Technology Review, NBC, Discovery Channel, NewScientist and Wired. (Learn more about Pedro's work here: plopes.org).

Selected YouTube links: VR Walls, Muscle Plotter, Affordance++

Architectural Techniques to Efficiently Handle Big Data Challenges

Date and Time

Thursday, January 18, 2018 - 11:00am to 12:00pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Natalie Enright Jerger, from University of Toronto

Host

Margaret Martonosi, Electrical Engineering

Abstract:
As Moore’s Law continues in the post-Dennard scaling era, architects and programmers must consider energy efficiency even more carefully as part of their designs. The energy cost of moving and storing data exceeds that of computing with that data. At the same time, we expect to see 100s of zettabytes of digital data in the next decade. This explosion of data that must be analyzed creates numerous challenges for current designs. In this talk, we will look at two techniques to address big data challenges facing computer architecture. One promising approach to boost both energy efficiency and performance is approximate computing. The approximate computing paradigm trades off correctness for improvements in energy and/or performance by targeting key applications that do not require 100% accurate execution, such as image processing and machine learning. We propose a microarchitectural technique, load value approximation, that selectively predicts memory values in order to forego expensive accesses to the memory hierarchy. By predicting instead of moving data, we can save energy and improve performance when a small amount of error is tolerable. In the second part of my talk, I will discuss the performance-cost trade-offs of interposer-based multi-chip, multi-core systems. Connecting multiple disparate chips via a silicon interposer allows us to tightly couple processors and memory within the same package for efficient data movement. I will briefly present our network solutions to realize these systems. Considering solutions that span technology, architecture and software opens up new opportunities to solve energy and performance challenges facing next generation systems.

Bio:
Natalie Enright Jerger is the Percy Edward Hart Professor of Electrical and Computer Engineering at the University of Toronto. Prior to joining the University of Toronto, she received her MSEE and PhD from the University of Wisconsin-Madison in 2004 and 2008, respectively. She received her Bachelor's degree from Purdue University in 2002. She received the Ontario Ministry of Research and Innovation Early Researcher Award in 2012, the Ontario Professional Engineers Young Engineer Medal in 2014, and the Borg Early Career Award in 2015. She served as the program co-chair of the 7th Network-on-Chip Symposium and as the program chair of the 20th International Symposium on High Performance Computer Architecture. She is currently serving as the ACM SIGMICRO Vice Chair and an ACM SIGARCH Executive Committee member. Her current research explores on-chip networks, approximate computing, IoT architectures and machine learning acceleration. She is also passionate about increasing the representation of women in computing, particularly in computer architecture. She currently chairs the organizing committee for the Women in Computer Architecture group (WICARCH). In 2017, she co-authored the second edition of the Computer Architecture Synthesis Lecture on On-Chip Networks with Li-Shiuan Peh and Tushar Krishna. Her research is supported by NSERC, Intel, CFI, AMD, Huawei and Qualcomm.

Computational Social Science: Exciting Progress and Future Challenges

Date and Time

Tuesday, December 12, 2017 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Duncan Watts, from Microsoft Research

Host

Department of Sociology

The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers, leading some to herald the emergence of a new field: “computational social science.” In this talk I highlight two areas of research that would not have been possible just a handful of years ago: first, using “big data” to study social contagion on networks; and second, using virtual labs to extend the scale, duration, and complexity of traditional lab experiments. Although these examples were all motivated by substantive problems of longstanding interest to social science, they also illustrate how new classes of data can cast these problems in new light. At the same time, they illustrate some important limitations faced by our existing data generating platforms. I then conclude with some thoughts on how CSS might overcome some of these obstacles to progress.

Bio: Duncan Watts is a principal researcher at Microsoft Research and a founding member of the MSR-NYC lab. He is also an AD White Professor at Large at Cornell University. Prior to joining MSR in 2012, he was a professor of Sociology at Columbia University from 2000 to 2007, and then a principal research scientist at Yahoo! Research, where he directed the Human Social Dynamics group. His research on social networks and collective dynamics has appeared in a wide range of journals, from Nature, Science, and Physical Review Letters to the American Journal of Sociology and Harvard Business Review, and has been recognized by the 2009 German Physical Society Young Scientist Award for Socio and Econophysics, the 2013 Lagrange-CRT Foundation Prize for Complexity Science, and the 2014 Everett M. Rogers Award. He is also the author of three books: Six Degrees: The Science of a Connected Age (W.W. Norton, 2003), Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton University Press, 1999), and most recently Everything is Obvious: Once You Know The Answer (Crown Business, 2011). Watts holds a B.Sc. in Physics from the Australian Defense Force Academy, from which he also received his officer’s commission in the Royal Australian Navy, and a Ph.D. in Theoretical and Applied Mechanics from Cornell University.

The blessing and the curse of the multiplicative updates - discusses connections between evolution and the multiplicative updates of online learning

Date and Time

Monday, May 22, 2017 - 12:30pm to 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

CS Department Colloquium Series

Speaker

Prof. Manfred K. Warmuth, from University of California, Santa Cruz

Host

Prof. Elad Hazan

Multiplicative updates multiply the parameters by nonnegative factors. These updates are motivated by a Maximum Entropy Principle and they are prevalent in evolutionary processes where the parameters are, for example, concentrations of species and the factors are survival rates. The simplest such update is Bayes' rule, and we give an in vitro selection algorithm for RNA strands that implements this rule in the test tube, where each RNA strand represents a different model. In one liter of the RNA soup there are approximately 10^15 different strands, and therefore this is a rather high-dimensional implementation of Bayes' rule.
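
For readers unfamiliar with the idea, the following is a minimal sketch (not taken from the talk) of Bayes' rule written as a multiplicative update: each model's weight is multiplied by a nonnegative factor, here its likelihood for the current observation, and the weights are then renormalized. The function name `bayes_update` and the example numbers are illustrative assumptions.

```python
def bayes_update(prior, likelihoods):
    """One multiplicative update: scale each weight by a nonnegative
    factor (the model's likelihood), then renormalize to sum to 1."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("every model assigns zero likelihood to the observation")
    return [u / total for u in unnormalized]


# Hypothetical example: four models with a uniform prior.
prior = [0.25, 0.25, 0.25, 0.25]
# Made-up likelihoods of one observation under each model.
likelihoods = [0.9, 0.5, 0.1, 0.0]
posterior = bayes_update(prior, likelihoods)
print(posterior)
```

Note that a model whose factor is zero ends up with weight zero and can never recover under further updates of this form, which foreshadows the "curse" described in the abstract.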

We investigate multiplicative updates for the purpose of learning online while processing a stream of examples. The “blessing” of these updates is that they learn very fast in the short term because the good parameters grow exponentially. However their “curse” is that they learn too fast and wipe out parameters too quickly. This can have a negative effect in the long term. We describe a number of methods developed in the realm of online learning that ameliorate the curse of the multiplicative updates. The methods make the algorithm robust against data that changes over time and prevent the currently good parameters from taking over. We also discuss how the curse is circumvented by nature. Surprisingly, some of nature's methods parallel the ones developed in Machine Learning, but nature also has some additional tricks.
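
The blessing and the curse can be illustrated with a small simulation. This is a hedged sketch, not the speaker's method: it uses an exponential-weights update over two experts and, as one possible remedy, mixes in a small amount of uniform weight after every step (a technique known in the online learning literature as Fixed Share). The abstract does not say which methods the talk covers; the function names, the learning rate `eta`, the mixing rate `share`, and the loss sequence are all assumptions chosen for illustration.

```python
import math


def hedge_step(weights, losses, eta=1.0, share=0.0):
    """Multiply each weight by exp(-eta * loss), renormalize, then
    optionally mix in a fraction `share` of the uniform distribution."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    normalized = [s / total for s in scaled]
    n = len(normalized)
    return [(1 - share) * w + share / n for w in normalized]


def rounds_to_recover(share, rounds_per_phase=50):
    """Phase 1: expert 0 is always right and expert 1 always wrong.
    Phase 2: the roles swap. Return how many phase-2 rounds pass
    before expert 1 holds at least half of the total weight."""
    weights = [0.5, 0.5]
    for _ in range(rounds_per_phase):
        weights = hedge_step(weights, [0.0, 1.0], share=share)
    rounds = 0
    while weights[1] < 0.5:
        weights = hedge_step(weights, [1.0, 0.0], share=share)
        rounds += 1
    return rounds


plain = rounds_to_recover(share=0.0)   # pure multiplicative update
mixed = rounds_to_recover(share=0.01)  # with a little uniform mixing
print(plain, mixed)
```

In this toy setting the pure update needs about as many rounds to recover as it spent driving the second expert's weight down, while the mixed version keeps every weight above a small floor and recovers in a handful of rounds.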

This will be a high-level talk.
No background in online learning will be required.
We will give a number of open problems and discuss how these updates are applied for training feedforward neural nets.