Colloquium

Data+

Date and Time
Tuesday, February 14, 2017 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Host
Amit Singer (PACM, CS, CSML)

In this talk I will explore how data science is changing the way we practice education and research.

My title, Data+, is the name of a ten-week summer research experience offered by the Information Initiative at Duke (iiD), in which undergraduates join small project teams, working alongside other teams in a communal environment. They learn how to marshal, analyze, and visualize data, as well as how to communicate with the client sponsoring the project. The 2016 program involved about 70 undergraduate students and 30 graduate student and postdoctoral mentors working on 25 projects. I will describe projects that illuminate different ways in which student-centered inquiry can transform university institutions.

Our ambition for research is that it impacts society. We used to think of impact as starting with an idea, then developing that idea into a prototype, then turning the prototype into a product, then marketing the product, and so on. It is a long march, and the problem with long marches is that most ideas don't make it. I will describe an example of a different model, one in which society is part of the research process: a collaboration with Apple on child mental health led by Guillermo Sapiro.

The Implications of Privacy-Aware Choice

Date and Time
Tuesday, February 7, 2017 - 4:30pm to 5:30pm
Location
Sherrerd Hall 101
Type
Colloquium

Privacy concerns are becoming a major obstacle to using data in the way that we want. It's often unclear how current regulations should translate into technology, and the changing legal landscape surrounding privacy can cause valuable data to go unused. In addition, when people know that their current choices may have future consequences, they might modify their behavior to ensure that their data reveal less, or perhaps more favorable, information about themselves. Given these concerns, how can we continue to make use of potentially sensitive data while providing satisfactory privacy guarantees to the people whose data we are using? Answering this question requires an understanding of how people reason about their privacy and how privacy concerns affect behavior.

In this talk, we will see how strategic and human aspects of privacy interact with existing tools for data collection and analysis.  I will begin by adapting the standard model of consumer choice theory to a setting where consumers are aware of, and have preferences over, the information revealed by their choices.  In this model of privacy-aware choice, I will show that little can be inferred about a consumer's preferences once we introduce the possibility that she has concerns about privacy, even when her preferences are assumed to satisfy relatively strong structural properties.  Next, I will analyze how privacy technologies affect behavior in a simple economic model of data-driven decision making.  Intuition suggests that strengthening privacy protections will both increase utility for the individuals providing data and decrease usefulness of the computation. I will demonstrate that this intuition can fail when strategic concerns affect consumer behavior.  Finally, I'll discuss ongoing behavioral experiments, designed to empirically measure how people trade off privacy for money, and to test whether human behavior is consistent with theoretical models for the value of privacy.
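
To make the privacy-utility tension concrete, here is a minimal sketch (not the speaker's model) using one standard privacy technology, the Laplace mechanism of differential privacy: the smaller the privacy parameter epsilon, the noisier the released statistic. The data and epsilon values below are hypothetical, and the strategic behavior discussed in the talk is deliberately not modeled.

```python
import random

def private_mean(values, epsilon, lower=0.0, upper=1.0):
    """Release the mean of `values` (clipped to [lower, upper]) under
    epsilon-differential privacy using the Laplace mechanism."""
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n      # max effect one record can have on the mean
    scale = sensitivity / epsilon          # required Laplace noise scale
    # Laplace(0, scale) sampled as the difference of two exponentials with mean `scale`.
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_mean + noise

# Hypothetical sensitive values in [0, 1]: smaller epsilon = stronger privacy, noisier answer.
data = [random.random() for _ in range(1000)]
for eps in (0.01, 0.1, 1.0, 10.0):
    print(f"epsilon={eps:5.2f}  released mean={private_mean(data, eps):.4f}  "
          f"true mean={sum(data) / len(data):.4f}")
```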

Acquiring Metric and Semantic Information using Autonomous Robots

Date and Time
Tuesday, February 23, 2016 - 12:31pm
Location
Not yet determined.
Type
Colloquium

Recent years have seen impressive progress in robot control and perception, including adept manipulation, aggressive quadrotor maneuvers, dense metric map reconstruction, and object recognition in real time. The grand challenge in robotics today is to capitalize on these advances in order to enable autonomy at a higher level of intelligence. It is compelling to envision teams of autonomous robots in environmental monitoring, precision agriculture, construction and structure inspection, security and surveillance, and search and rescue.

In this talk, I will emphasize that many such applications can be addressed by thinking about how to coordinate robots in order to extract useful information about the environment. More precisely, I will formulate a general active estimation problem that captures the common characteristics of the aforementioned scenarios. I will show how to manage the complexity of the problem over metric information spaces with respect to long planning horizons and large robot teams. These results lead to computationally scalable, non-myopic algorithms with quantified performance for problems such as distributed source seeking and active simultaneous localization and mapping (SLAM).

I will then focus on acquiring information using both metric and semantic observations (e.g., object recognition). In this context, there are several new challenges, such as missed detections, false alarms, and unknown data association. To address them, I will model semantic observations via random sets and discuss filtering using such models. A major contribution of our approach is proving that the complexity of the problem is equivalent to computing the permanent of a suitable matrix. This enables us to develop and experimentally validate algorithms for semantic localization, mapping, and planning on mobile robots, Google's Project Tango phone, and the KITTI visual odometry dataset.
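
For readers unfamiliar with the combinatorial object mentioned above, the sketch below computes the permanent of a small matrix with Ryser's inclusion-exclusion formula, which hints at why data association is expensive. The "score matrix" is hypothetical (think of entry [i][j] as the likelihood that detection i came from object j); this is an illustration, not the authors' filter.

```python
from itertools import combinations

def permanent(a):
    """Permanent of an n x n matrix via Ryser's formula, O(2^n * n^2) as written."""
    n = len(a)
    total = 0.0
    for r in range(1, n + 1):                     # non-empty column subsets
        for cols in combinations(range(n), r):
            prod = 1.0
            for row in a:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** r * prod
    return (-1) ** n * total

# Hypothetical detection-to-object likelihoods for 3 detections and 3 objects.
scores = [[0.9, 0.1, 0.0],
          [0.2, 0.7, 0.1],
          [0.0, 0.3, 0.6]]
# The permanent sums the weights of every possible data association (0.417 here).
print(permanent(scores))
```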

Instruction Sets Want to Be Free: A Case for RISC-V

Date and Time
Wednesday, November 11, 2015 - 4:30pm to 5:30pm
Location
Friend Center Convocation Room
Type
Colloquium

We start by quickly reviewing 50 years of computer architecture to show that there is now widespread agreement on instruction set architecture (ISA). Despite this harmony, and unlike most other fields, there is no open alternative to the proprietary offerings from ARM and Intel.
Thus, we propose RISC-V (“RISC Five”), which targets Systems on a Chip (SoC). It has:

  • A small base of <50 classic RISC instructions that run a full open-source software stack.
  • Opcodes reserved for tailoring an SoC to applications.
  • Standard instruction extensions optionally included in an SoC.
  • An open process that incorporates community suggestions before extensions are finalized.
  • A foundation that evolves RISC-V slowly, based solely on technical reasons voted on by members, in contrast to companies that inflate ISAs rapidly for business as well as technical reasons; ARM and Intel average about two new instructions per month.
  • No restrictions: there is no cost, no paperwork, and anyone can use it.

Attendees will get a 2-page reference card (“green card”), which lists all RISC-V extensions, to contrast this minimal ISA with the 3,600-page x86 manual and the 5,400-page ARMv8 manual.
We conclude by recapping 10 RISC-V chips built using Agile methods in just 4 years, including how shockingly cheap it is today to manufacture 100 2x2-mm, 28-nm chips. We used Chisel, a new hardware design language that reduces design effort by greatly increasing reuse.
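
As a rough illustration of how small the base encoding is, the sketch below assembles a single RV32I instruction (ADDI) from its fixed fields, following the published I-type layout; it is a toy sketch, not an official RISC-V tool.

```python
def encode_addi(rd, rs1, imm):
    """Encode `addi rd, rs1, imm` as one 32-bit RV32I instruction word
    (I-type: imm[31:20] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0])."""
    assert 0 <= rd < 32 and 0 <= rs1 < 32, "registers x0-x31 only"
    assert -2048 <= imm < 2048, "12-bit signed immediate"
    opcode, funct3 = 0b0010011, 0b000          # OP-IMM major opcode, ADDI minor code
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

# addi x1, x0, 42  ->  put the constant 42 in register x1
print(hex(encode_addi(rd=1, rs1=0, imm=42)))   # 0x2a00093
```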
 
David Patterson joined UC Berkeley nearly 40 years ago. He has been Director of several research labs, Chair of Berkeley’s CS Division, Chair of the Computing Research Association, and President of ACM. His most successful projects are likely Reduced Instruction Set Computers (RISC), Redundant Arrays of Inexpensive Disks (RAID), and Network of Workstations (NOW), all of which helped lead to multi-billion-dollar industries. This research led to many papers, 6 books, and about 35 honors for him and his friends, including election to the National Academy of Engineering, the National Academy of Sciences, and the Silicon Valley Engineering Hall of Fame. He shared the IEEE von Neumann Medal and the NEC C&C Prize with John Hennessy, co-author of two of his books and President of Stanford University.

The Resilience of the Perceptron

Date and Time
Monday, September 28, 2015 - 4:30pm to 5:30pm
Location
Fine Hall 214
Type
Colloquium
Host
PACM and CS

The most widely used optimization method in machine learning practice is the Perceptron Algorithm, also known as the Stochastic Gradient Method (SGM). This method has been used since the fifties to build statistical estimators, iteratively improving models by correcting errors observed on single data points. SGM is not only scalable, robust, and simple to implement, but also achieves state-of-the-art performance in many different domains. In contemporary systems, SGM powers enterprise analytics systems and is the workhorse tool used to train complex pattern-recognition systems in speech and vision.

In this talk, I will explore why SGM has had such staying power, focusing on notions of stability and robustness. I will first discuss how SGM is robust to perturbations of the model and the updates. From a computing systems perspective, this robustness enables parallel implementations with minimal communication, no locking or synchronization, and strong spatial locality. I will then show how SGM is robust to perturbations of the data itself, and prove that any model trained with the stochastic gradient method in a reasonable amount of time attains small generalization error. I will subsequently provide a new interpretation of common practices in neural networks, along with a formal rationale for many popular techniques used in training large, deep models.
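
As a concrete reminder of the update in question, here is a minimal sketch of SGM training a linear classifier with logistic loss on synthetic data (the classic perceptron update is the same loop with a hard-threshold gradient). The data, step size, and epoch count are illustrative only and are not taken from the talk.

```python
import math
import random

def sgd_logistic(data, dim, lr=0.1, epochs=5):
    """Stochastic gradient descent on a linear model with logistic loss.
    `data` is a list of (features, label) pairs with labels in {-1, +1}."""
    w = [0.0] * dim
    for _ in range(epochs):
        random.shuffle(data)                              # one pass over shuffled examples
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            g = -y / (1.0 + math.exp(margin))             # scalar part of the loss gradient
            for i in range(dim):
                w[i] -= lr * g * x[i]                     # correct the model using one example
    return w

# Synthetic, linearly separable data: label = sign(x0 - x1), plus a constant bias feature.
data = []
for _ in range(500):
    x = [random.uniform(-1, 1), random.uniform(-1, 1), 1.0]
    data.append((x, 1 if x[0] - x[1] > 0 else -1))
w = sgd_logistic(data, dim=3)
errors = sum(1 for x, y in data if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0)
print(f"learned weights = {w}, training errors = {errors}/500")
```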

Sustainable Networks and Systems for the Developing World

Date and Time
Tuesday, April 28, 2009 - 3:00pm to 4:30pm
Location
Friend Center 013
Type
Colloquium
Speaker
Lakshmi Subramanian, from NYU
Host
Jennifer Rexford
The Internet and the World Wide Web have largely remained urban phenomena, and a significant fraction of the developing world, especially in rural and underdeveloped regions, remains disconnected from the rest of the world.

In this talk, I will briefly describe our current efforts in addressing three broad challenges that can help bridge the digital divide: (a) developing low-cost, high-bandwidth connectivity solutions; (b) extending the Web to developing regions; and (c) building new mobile applications for enhancing healthcare and financial services. This talk will focus specifically on the challenge of low-cost connectivity solutions for developing regions. I will describe our experiences in developing and deploying WiFi-based Long Distance (WiLD) networks, a low-cost connectivity solution that achieves roughly 6-10 Mbps over hundreds of kilometers for approximately $1000 per link. WiLD networks have been deployed in 10 developing countries and have been used to provide telemedicine services at Aravind Eye Hospitals in India to 50,000 rural patients per year, a program currently being expanded to cover 500,000 patients per year. We are now extending WiLD networks into WiFi-based Rural Extensions (WiRE), a low-cost alternative to cellular networks for rural settings.

Bio

Prof. Lakshminarayanan Subramanian is an Assistant Professor in the Courant Institute of Mathematical Sciences at NYU. His research interests are in the space of networks, distributed systems, security, and technologies for developing regions. He received his MS and PhD from UC Berkeley and his B.Tech from IIT Madras.

Don't Swallow the Snake Oil: Oscar Wilde, Method Acting, and Vulnerability Assessments

Date and Time
Wednesday, April 22, 2009 - 4:30pm to 6:00pm
Location
Woolworth Center 202
Type
Colloquium
Speaker
Dr. Johnston, from Argonne National Laboratory
Host
Edward Felten
Both cyber security and physical security (but especially the latter) are plagued with poor products, flawed practice, and sloppy thinking. This talk discusses what we can learn from Oscar Wilde and Method Acting about how to think critically about security. We will then examine some of the snake oil associated with GPS, RFIDs, tamper-indicating seals, biometrics, and other “security” technologies.

Dr. Johnston is head of the Vulnerability Assessment Team at Argonne National Laboratory.

Making Change Happen: Lessons from the Obama Campaign

Date and Time
Thursday, April 16, 2009 - 4:30pm to 6:00pm
Location
Computer Science Large Auditorium (Room 104)
Type
Colloquium
Speaker
Joe Rospars
Host
Edward Felten
Mr. Rospars will talk about his experiences as the New Media Director for Barack Obama's presidential campaign, where he oversaw all online aspects of the unprecedented fundraising, communications, and grassroots mobilization effort. He led a wide-ranging program that integrated design and branding, web and video content, mass email, text messaging, online advertising, organizing, and fundraising.

Bio:

Mr. Rospars is a founding partner of Blue State Digital. Prior to the Obama campaign, Mr. Rospars led BSD’s work with Gov. Howard Dean at the Democratic National Committee; during Dean’s campaign for party chairman; and at Democracy for America. He was a writer and strategist in New Media for Dean’s 2004 Presidential campaign.

He holds a bachelor’s degree in political science from the George Washington University. CITP is a joint venture of the School of Engineering and Applied Science and the Woodrow Wilson School that studies digital technologies in public life. Co-sponsored by the Center for the Study of Democratic Politics.

It's the Content, Stupid: How Content is Driving Network Evolution

Date and Time
Monday, March 30, 2009 - 3:00pm to 4:30pm
Location
Friend Center 013
Type
Colloquium
Speaker
Kobus van der Merwe, from AT&T Labs - Research
Host
Jennifer Rexford
Security and robustness, operational complexity, and dealing with demand growth are arguably the most significant challenges facing large-scale networks.

In this talk, I will show how Internet growth is driven by significant increases in content-related traffic. Efficiently dealing with the distribution of this content presents a significant challenge to service providers. I will present our work on load-aware anycast, which offers a cross-layer approach to content distribution that considers both the content and network layers.

This work is currently being readied for trial in the AT&T network and builds directly on our earlier work on Closed-loop Route Control, in which route selection is informed by network intelligence. As such, the work presents a successful case study of the transition from research concept to production. However, it also serves to illustrate how difficult it is to make this transition, or, more generally, how difficult it is to effect any change in a provider network.

In the second part of my talk, I will present our recent work on a platform called ShadowNet. With ShadowNet, we aim to address this fundamental challenge of network change by introducing an operational trial/test network that "sits between" a lab testbed and the production network. Specifically, each ShadowNet node is composed of a collection of carrier-grade equipment (routers, switches, and servers). Each node is connected to both the Internet and other ShadowNet nodes via a (virtual) backbone. ShadowNet exploits the strong separation between logical functionality and physical realization that is offered by modern equipment. As such, ShadowNet is a sharable, programmable, and composable infrastructure that allows multiple technology and service trials to be executed in parallel in a realistic operational setting. I will describe the ShadowNet control framework, our prototype implementation, and our deployment plans.

Bio: Kobus Van der Merwe is a Principal Member of Technical Staff in the Internet and Networking Systems Research Center at AT&T Labs - Research, where he is fortunate to do networking systems research that is both forward-looking and sometimes finds application in the AT&T network. His current research interests include network management, control, and operation; network security; and content distribution.

Managing the Complexity of Modern Enterprise Networks

Date and Time
Thursday, March 26, 2009 - 3:00pm to 4:30pm
Location
Friend Center 013
Type
Colloquium
Speaker
Aditya Akella, from University of Wisconsin
Host
Jennifer Rexford
Operator interviews and anecdotal evidence suggest that an operator's ability to manage a network decreases as the network becomes more complex. Today, there is no way to quantify how complex a network's design is, nor how complexity may impact network management activities. In this talk, I will present a suite of "complexity models" that describe the routing design and configuration of a network in a succinct fashion, abstracting away details of the underlying configuration languages. Our models, and the complexity metrics arising from them, capture the difficulty of configuring control and data plane behaviors on routers. They also measure the inherent complexity of the reachability constraints that a network implements via its routing design. Our models simplify network design and management by facilitating comparison between alternative designs for a network.
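
The abstract does not spell out the metrics themselves, so the following is only a hypothetical illustration of the kind of quantity such models might surface: counting cross-references between stanzas of a Cisco-style configuration (here, interfaces referring to ACLs, including dangling references) as a crude proxy for how entangled a design is. It is not the authors' model.

```python
import re

def count_acl_references(config_text):
    """Count interface-to-ACL references in a Cisco-style config snippet."""
    defined = set(re.findall(r"^ip access-list \S+ (\S+)", config_text, re.M))
    used = re.findall(r"^\s*ip access-group (\S+)", config_text, re.M)
    dangling = [name for name in used if name not in defined]
    return {"acls_defined": len(defined),
            "interface_references": len(used),
            "dangling_references": len(dangling)}

sample = """\
ip access-list extended MGMT
 permit tcp any any eq 22
interface GigabitEthernet0/1
 ip access-group MGMT in
interface GigabitEthernet0/2
 ip access-group GUEST in
"""
# {'acls_defined': 1, 'interface_references': 2, 'dangling_references': 1}
print(count_acl_references(sample))
```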

We tested our models on seven networks, comprising four universities and three enterprises. We validated the results through systematic interviews with the operators of five of these networks and found the metrics to be predictive of the issues operators face when reconfiguring their networks. A surprising outcome of our study was that it uncovered the reasons operators chose the designs they did.

Given the frequency with which configuration errors are responsible for major outages, we believe that creating techniques to quantify the complexity of a network's design is an important first step toward reducing that complexity. In databases, software engineering, and other fields, metrics and benchmarks have driven the direction of the field by defining what is desirable and what is not. In proposing these metrics, we hope to start a similar conversation for network design.

I will end the talk with a brief description of our recent work on WAN optimization.
