Talk | Page 10 | Computer Science Department at Princeton University

Bug Bytes: Bioinformatics for Metagenomics and Microbial Community Analysis

Date and Time

Wednesday, August 22, 2012 - 11:00am to 12:00pm

Location

Carl Icahn Lab, 208

Type

Talk

Speaker

Curtis Huttenhower *08, from Harvard University

Host

Olga Troyanskaya

Among many surprising insights, the genomic revolution has helped us to realize that we're never alone and, in fact, barely human. For most of our lives, we share our bodies with some ten times as many microbes as human cells; these are resident in our gut and on nearly every body surface, and they are responsible for a tremendous diversity of metabolic activity, immunomodulation, and intercellular signaling.

These microbial communities have only recently become well described using high-throughput sequencing, requiring analyses that simultaneously apply techniques from genomics, "big data" mining, and molecular epidemiology. I will discuss emerging end-to-end bioinformatics approaches for metagenomics, including initial handling of sequence data for mixed microbial communities, its reconstruction into metabolic pathways, and biomarker discovery in disease. In particular, computational processing is key both in identifying unique markers for microbial taxonomy and phylogeny and in identifying genes and pathways significantly disrupted in inflammatory conditions such as Crohn's disease and ulcerative colitis.

The Floodlight OpenFlow Controller

Date and Time

Tuesday, May 29, 2012 - 11:00am to 12:00pm

Location

Computer Science 402

Type

Talk

Speaker

Mike Cohen, from Big Switch Networks

Host

Jennifer Rexford

Software-Defined Networking (SDN) and OpenFlow are quickly emerging as powerful trends in the networking industry. By separating the control and data planes, these technologies are accelerating the innovation cycle and creating opportunities for new application research and development. This talk will provide an overview of SDN and OpenFlow and introduce Floodlight, an Apache-licensed OpenFlow controller. Floodlight is driven by a community of developers including Big Switch Networks and was designed to offer a fast, easy-to-use programming environment, a rich set of development tools, and interoperability with a wide array of networking devices. The talk will also discuss other useful SDN development tools and highlight active areas for research and development.

Mike Cohen is a product manager at Big Switch Networks. He was previously an early engineer on VMware’s hypervisor team and a search infrastructure product manager at Google. Mike holds a B.S.E. in Electrical Engineering from Princeton and a master's degree from Harvard.

Scalable Label Assignment in Data Center Networks

Date and Time

Tuesday, May 1, 2012 - 11:00am to 12:00pm

Location

Computer Science 402

Type

Talk

Speaker

Meg Walraed-Sullivan, from University of California, San Diego

Host

Jennifer Rexford

Modern data centers can consist of hundreds of thousands of servers and millions of virtualized end hosts. A key challenge in providing scalable communication in the data center is assigning identifiers, or labels, to network elements and servers so that they can efficiently communicate and perform cooperative tasks. The scale and complexity of a data center make the labeling problem unique in this environment, and solutions often resort to manual configuration that is costly, time-consuming, and error-prone. In this talk, I will present ALIAS, a distributed protocol for topology discovery and label assignment in data center networks. ALIAS automates the assignment of topologically meaningful addresses to the nodes in a data center, enabling scalable communication while significantly reducing the management burden of manual configuration at scale.

Meg is a PhD candidate in the Department of Computer Science and Engineering at the University of California, San Diego, working with Amin Vahdat and Keith Marzullo. Her research interests are rooted in distributed systems and algorithms. She has recently applied these interests to the data center, enabling scalable communication via strategic label assignment and exploring the relationship between fault tolerance and key properties of hierarchical topologies. Meg received a B.S. and an M.Eng. in Electrical and Computer Engineering from Cornell University and will complete her PhD in Computer Science in the summer of 2012.

Will The Global Village Fracture into Tribes: The Impact of Recommender Systems on Consumers

Date and Time

Thursday, April 19, 2012 - 4:30pm to 5:30pm

Location

Engineering Quadrangle B205

Type

Talk

Speaker

Kartik Hosanagar, from The Wharton School

Recommender systems personalize the browsing and consumption experience to each user's taste in environments with many product choices. Popular applications include product recommendations at e-commerce sites and online newspapers' automated selection of articles to display based on the current reader's interests. This ability to focus more closely on one's taste and filter all else out has spawned criticism that recommenders will fragment consumers. Critics say recommenders cause consumers to have less in common with one another and that the media should do more to increase exposure to a variety of content. We present an empirical study of recommender systems in the music industry. In contrast to concerns that users are becoming more fragmented, we find that in our setting users' purchases become more similar to one another. This increase in purchase similarity occurs for two reasons, which we term volume and taste effects. The volume effect is that consumers simply purchase more after recommendations, increasing the chance of having more purchases in common. The taste effect is that, conditional on volume, consumers buy a more similar mix of products after recommendations. When we view consumers' purchases as a similarity network before versus after recommendations, we find that the network becomes denser and smaller, that is, characterized by shorter inter-user distances.
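The difference between the volume and taste effects can be illustrated with two simple measures on purchase sets. This is only a sketch: the function names, the choice of Jaccard similarity, and the purchase data are assumptions made for illustration, not the measures or data used in the study. The raw count of shared purchases grows as both users buy more, while Jaccard similarity divides by the size of the combined basket and is therefore less sensitive to volume.

```python
def common_purchases(a: set[str], b: set[str]) -> int:
    """Raw overlap: number of items both users bought (sensitive to volume)."""
    return len(a & b)


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap relative to the combined basket (a rough volume normalization)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical purchase sets for two users, before and after recommendations.
before_a = {"album1", "album2"}
before_b = {"album3", "album4"}
after_a = before_a | {"album5", "album6"}
after_b = before_b | {"album5", "album6"}

print(common_purchases(before_a, before_b), jaccard(before_a, before_b))
print(common_purchases(after_a, after_b), jaccard(after_a, after_b))
```

Comparing both measures before and after is one way to ask whether an increase in overlap comes only from larger baskets or also from a more similar mix of products.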

Kartik Hosanagar is an Associate Professor at The Wharton School of the University of Pennsylvania. Kartik's research work focuses on Internet media and Internet marketing. Kartik has been recognized as one of the world's top 40 business professors under 40. He has received several teaching awards including the MBA and Undergraduate Excellence in Teaching awards at the Wharton School. His research has received several awards, including the William Cooper award for best thesis in Management Science. Kartik is a cofounder of Yodle Inc, a venture-backed firm recently listed among the top 50 fastest growing private firms in the US. Kartik holds a PhD in Management Science and Information Systems from Carnegie Mellon University.

The paper is available for download at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1321962

From Theory to Practice in Wireless Multimedia Delivery

Date and Time

Wednesday, April 4, 2012 - 11:00am to 12:00pm

Location

Computer Science 302

Type

Talk

Speaker

Amitabh Ghosh, from Princeton University

Host

Jennifer Rexford

The sharing and distribution of multimedia content over the Internet has taken a momentous turn, keeping pace with the increasing market for wireless handhelds. Recent predictions (Cisco VNI) indicate that mobile video traffic will comprise more than 70% of all mobile data traffic by 2016. In response, many ISPs and content providers are taking several measures, including usage-based tiered pricing, throttling bandwidth for top data users, and off-loading cellular traffic onto Wi-Fi networks. However, multimedia processing and generation offer degrees of freedom that could be leveraged to bridge this widening gap between content providers and pipe providers.

In this talk, I will describe two of my recent projects that address the problem of efficiently delivering multimedia content over cellular networks. In particular, the first part of the talk focuses on content-aware networking, that is, how one could exploit the structure of the video content to design adaptive content-aware protocols that are distortion-fair rather than just bandwidth-fair. The second part of the talk focuses on the related issue of generating network-aware content by leveraging video compressibility, consumer usage patterns, and monthly quota. On both these topics, I will present some implementation details on software-defined radios and Android devices, as well as real-world trial results obtained from Princeton community volunteers.

Symantec's WINE System for Repeatable, Data-Intensive Experiments in Cyber Security

Date and Time

Wednesday, March 28, 2012 - 12:30pm to 1:30pm

Location

Sherrerd Hall 306

Type

Talk

Speaker

Tudor Dumitraș, from Symantec

Host

Jennifer Rexford

The Worldwide Intelligence Network Environment (WINE) is a platform, developed at Symantec Research Labs (SRL), for conducting data-intensive experiments in cyber security. We have built WINE focusing on the challenges of aggregating multiple terabyte-size data feeds, which Symantec uses in its day-to-day operations, and of supporting open-ended experiments at scale. WINE also enables the reproduction of prior experimental results, by archiving the reference data sets that researchers use and by recording information on the data collection process and on the experimental procedures employed.

The need for such a platform arose from SRL’s program for sharing field data, collected by Symantec on millions of hosts worldwide, with researchers in academia. For example, WINE includes historical information on unknown binaries found on the Internet, providing unique insights into the origins and prevalence of zero-day attacks, as well as telemetry from Symantec’s anti-virus products, indicating the effectiveness of defensive mechanisms (e.g., security patches, anti-virus signatures). In addition to cyber security, the WINE data is relevant to research in machine learning, mobile computing, software reliability, storage systems, and visual analytics. In this talk, I will also discuss the challenges of sharing sensitive data and of establishing a rigorous benchmark for cyber security.

Tudor Dumitraș is a senior research engineer at Symantec Research Labs (SRL), currently building the Worldwide Intelligence Network Environment (WINE). Tudor's prior research focused on improving the dependability of large-scale distributed systems (addressing operator errors during software upgrades), of enterprise systems (addressing the predictability of fault-tolerant middleware), and of embedded systems (addressing soft errors in networks-on-chip). He received the 2011 A. G. Jordan Award, from the ECE Department at Carnegie Mellon University, for an outstanding Ph.D. thesis and for service to the community, the 2009 John Vlissides Award, from ACM SIGPLAN, for showing significant promise in applied software research, and the Best Paper Award at ASP-DAC'03. Tudor holds a Ph.D. degree from Carnegie Mellon University.

Smarter Energy: The Promise of Cyber-Physical Systems

Date and Time

Friday, March 30, 2012 - 11:00am to 12:00pm

Location

Computer Science Small Auditorium (Room 105)

Type

Talk

Speaker

Shivkumar Kalyanaraman, from IBM Research

Host

Jennifer Rexford

This talk will review how the climate change problem is linked to the fossil-fuel energy problem, and give an overview of various options for sustainable energy and their relative contributions. We then discuss smarter energy, and how sensing, networking, real-time analytics, actuation and control come together in a "cyber-physical" system. The talk will discuss key IBM Research and some commercial initiatives worldwide, ranging from deep rainforest microgrids in Brunei and smart grid tomography in Australia to the integration of fickle wind energy and plug-in hybrid electric vehicles (PHEV) in Denmark and context-sensitive "smart" demand-response systems for residential use.

Shivkumar Kalyanaraman is a Senior Technical Staff Member (STSM) & Senior Manager of the Next Gen Systems & Smarter Planet Solutions Department at IBM India Research Labs, Bangalore. He was previously a Manager of the Next Generation Telecom Research group and a Research Staff Member since 2008. He was a full Professor at the Department of Electrical, Computer and Systems Engineering at Rensselaer Polytechnic Institute in Troy, NY. He received a B.Tech degree in Computer Science from the Indian Institute of Technology, Madras, India in July 1993, followed by M.S. and Ph.D. degrees at the Ohio State University in 1994 and 1997 respectively. He also holds an Executive M.B.A. (EMBA) degree from Rensselaer Polytechnic Institute (2005). His current research in IBM is at the intersection of emerging wireless technologies and IBM middleware and systems technologies with applications to large-scale smarter planet problems (grids, traffic, finance, etc.). He was selected by MIT's Technology Review Magazine in 1999 as one of the top 100 young innovators for the new millennium. He served as the TPC Co-chair of IEEE INFOCOM 2008, and General co-chair of ACM SIGCOMM 2010 in New Delhi. He is on the editorial board of IEEE/ACM Transactions on Networking. He is a Fellow of the IEEE, and an ACM Distinguished Scientist.

Network Virtualization for Large Data Centers and Enterprises

Date and Time

Wednesday, April 11, 2012 - 4:30pm to 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Type

Talk

Speaker

Dr. Changhoon Kim, from Microsoft Azure

Host

Jennifer Rexford

Data centers are the digital-era analogue of factories and are gaining huge popularity among service providers and enterprises. The golden rule of designing and operating a data center is maximizing the amount of useful work per dollar spent. To meet this goal, the most desirable technical feature is agility: the ability to assign any computing resources to any tenants at any time. Anything less inevitably results in stranded resources and poor performance perceived by data-center users.

In this talk, I first show how and why conventional networks specifically designed for large data centers inhibit, rather than facilitate, agility. Location-dependent constraints, huge and unpredictable performance variances, and poor scalability are the main culprits. Then I put forward network virtualization as the key architecture that ensures agility at scale. The core properties of my network-virtualization architecture (abstraction, isolation, and efficiency) allow service providers to build a network that is not susceptible to the location dependencies and poor scalability in the first place, eliminating the huge burden of complicated and less effective cross-layer optimizations needed to outmaneuver the constraints. Then I explain how I turn this architecture into an operational system that virtualizes mega data-center networks for a real-world cloud service. In particular, I show how my virtualization architecture and its realization uniquely take advantage of a few critical opportunities and technical trends available in data centers, ranging from the power of a software switch present in every hypervisor, to the principle of separating network state from host state, and to the availability of commodity switching ASICs. Finally, I evaluate how faithfully the resulting system meets the goal of offering a powerful and simple virtual-network abstraction: an imaginary switch that can host as many servers as a customer desires, offering predictably and uniformly high capacity between any servers under any traffic patterns, and yet dedicated only to the customer.

Changhoon Kim works at Windows Azure, Microsoft's cloud-service division, and leads research and engineering projects on the architecture, performance, management, and operation of datacenter and enterprise networks. His research themes span network virtualization, self-configuring networks, and debugging and diagnosis of large-scale distributed systems. Changhoon received his Ph.D. from Princeton University in 2009, where he worked with Prof. Jennifer Rexford. Many of his research outcomes (including SEATTLE, VL2, VNet, Seawall, and the relay-routing technology for VPNs) are either directly adopted by production service providers or under review by standards bodies, such as the IETF. In particular, his VL2 work was published in the Research Highlights section of the Communications of the ACM (CACM) as an invited paper, which the editors recognize as "one of the most important research results published in CS in recent years".

The Bloom Paradox: When not to Use a Bloom Filter?

Date and Time

Monday, April 2, 2012 - 12:30pm to 1:20pm

Location

Computer Science 302

Type

Talk

Speaker

Ori Rottenstreich, from Technion

Host

Jennifer Rexford

Bloom filters and counting Bloom filters (CBFs) are widely used in networking device algorithms. They implement fast set representations to support membership queries with limited error. Unlike Bloom filters, CBFs also support element deletions.
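As a rough illustration of the data structure described above, here is a minimal counting Bloom filter sketch. It is not taken from the talk: the class name, the use of salted SHA-256 to derive counter positions, and the sizes in the usage lines are all assumptions chosen for brevity. Replacing each bit with a counter is what makes deletion possible.

```python
import hashlib


class CountingBloomFilter:
    """Minimal counting Bloom filter: an array of counters instead of bits."""

    def __init__(self, num_counters: int, num_hashes: int) -> None:
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item: str) -> list[int]:
        # Derive counter indices by salting one hash function with the index i.
        # Duplicate indices are merged so add() and remove() stay symmetric.
        positions = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.num_counters)
        return sorted(positions)

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: str) -> None:
        # Only meaningful for items that were actually added; removing other
        # items can introduce false negatives for the items that remain.
        positions = self._positions(item)
        if all(self.counters[pos] > 0 for pos in positions):
            for pos in positions:
                self.counters[pos] -= 1

    def __contains__(self, item: str) -> bool:
        # "Yes" may be a false positive; "no" is reliable if removals were valid.
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(num_counters=1024, num_hashes=4)
cbf.add("10.0.0.1")
print("10.0.0.1" in cbf)  # True: added items are always reported as present
cbf.remove("10.0.0.1")
print("10.0.0.1" in cbf)  # False: the filter is empty again
```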

In this talk, we uncover the Bloom paradox in Bloom filters: sometimes, it is better to disregard the query results of Bloom filters, and in fact not to even query them, thus making them useless. We analyze conditions under which the Bloom paradox occurs in a Bloom filter, and demonstrate that it depends on the a priori probability that a given element belongs to the represented set. We show that the Bloom paradox also applies to CBFs, and depends on the product of the hashed counters of each element.
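The dependence on the a priori probability can be illustrated with Bayes' rule. The sketch below is an assumption-laden illustration rather than the talk's analysis: it assumes a plain Bloom filter with no false negatives, a known false-positive rate, and a function name chosen here for convenience. It computes how likely an element really is in the set once the filter has answered "yes".

```python
def posterior_membership(prior: float, false_positive_rate: float) -> float:
    """P(element in set | filter answers 'yes'), assuming no false negatives."""
    if not 0.0 <= prior <= 1.0 or not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("probabilities must lie in [0, 1]")
    # P(yes) = P(in set) * 1 + P(not in set) * P(false positive)
    evidence = prior + (1.0 - prior) * false_positive_rate
    if evidence == 0.0:
        raise ValueError("a positive answer has probability zero for these inputs")
    return prior / evidence


# Hypothetical numbers: the same filter, queried under two different priors.
print(posterior_membership(prior=0.5, false_positive_rate=1e-3))
print(posterior_membership(prior=1e-6, false_positive_rate=1e-3))
```

With a prior of one in a million and a false-positive rate of one in a thousand, a positive answer corresponds to a true member only about 0.1% of the time, so acting on it can be worse than ignoring the filter. Whether it actually is worse depends on the costs of each kind of error, which this sketch does not model.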

Improving network agility with seamless BGP reconfigurations

Date and Time

Monday, April 2, 2012 - 12:30pm to 1:20pm

Location

Computer Science 302

Type

Talk

Speaker

Laurent Vanbever, from UCL (Louvain-la-Neuve, Belgium)

Host

Jennifer Rexford

Today, the network infrastructure of Internet Service Providers (ISPs) undergoes constant evolution. Whenever new requirements arise (e.g., the deployment of a new point-of-presence, or a change in the business relationships with a neighboring ISP), network operators need to modify the BGP configuration of their network. Due to the complexity of BGP, and the lack of methodologies and tools, maintaining service availability during reconfigurations that involve BGP is a challenge for network operators.

In this talk, we address the problem of deploying a new BGP configuration in a running ISP with no impact on the data-plane traffic. First, we show that the current best practices to reconfigure BGP (eBGP and iBGP) do not provide guarantees with respect to packet loss. In particular, we show that long-lasting routing and forwarding anomalies can occur even when the initial and the final BGP configurations are anomaly-free. Then, we study the problem of finding an operational ordering of the reconfiguration steps which guarantees no packet loss. Unfortunately, such an operational ordering, when it exists, is computationally hard to find. Finally, to enable disruption-free reconfigurations, we propose a framework which extends current carrier-grade routers to run two BGP control planes in parallel. We present a prototype implementation and show its effectiveness through a case study.