Research and Projects

    My research interests span the areas of distributed systems, security, networking, and applied cryptography. I particularly enjoy devising technologies that make new functionality broadly available, often by coupling principled designs with real-world deployments.

    Research funded through the National Science Foundation (Awards #0831374, #0904860, #0904729), the Office of Naval Research (Young Investigator Program), Cisco Systems, and Princeton University (Grand Challenges Program and Emerson Electric Award).


    Ongoing research includes:

  • Peer-assisted content distribution:   Conceived and led the Coral Project. Designed and built an Internet-scale, self-organizing web-content distribution network. CoralCDN uses a network of cooperating DNS redirectors and HTTP proxies, backed by a decentralized indexing infrastructure, to allow oblivious clients to transparently download content from nearby servers, while avoiding distant or heavily-loaded ones. CoralCDN has been in production use on 300 servers since March 2004, currently receiving about 25 million HTTP requests from over 1 million clients per day, serving several terabytes of data.

    Ongoing research focuses on untrusted peer-assisted CDNs, employing price theory to study how peer demand can be efficiently matched to available supply. This work characterizes the efficiency and robustness gains that are enabled by price-based multilateral exchange, as opposed to bilateral exchanges such as BitTorrent. The corresponding system mechanisms of PACE (Price-Assisted Content Exchange) encourage both the sharing of desirable content and network-friendly resource utilization.

    Leveraging these techniques for peer selection and resource allocation, we are currently building a new browser-based P2P-CDN called FireCoral. FireCoral enables mutually-distrustful users to "share" their browser caches, yet ensures the authenticity of content and enables users to preserve privacy by expressing flexible content sharing policies.

  • Federated virtual worlds:   Researching design issues---scalable, secure system and network architectures---for building Internet-scale virtual worlds. We focus especially on the challenges of supporting large-scale worlds; providing immersive visual, audio, and physics experiences; enforcing spam-resistant communication models; and federating world providers.

  • Service-centric network architectures:   Designing and developing a new network architecture, SCAFFOLD, that argues for directly addressing (potentially distributed or replicated) objects or services, rather than hosts, and more tightly coupling naming and routing. Rather than employing ad-hoc techniques at various network layers, SCAFFOLD directly focuses on handling the many kinds of churn---from failures, planned maintenance, load balancing, workload migration, and physical mobility---that arise in networked services.

  • Scalable storage systems:   Developing a datacenter-focused storage system---CRAQ (Chain Replication with Apportioned Queries)---capable of good availability, high throughput, and low latency, while providing a sliding scale of read consistency options (from eventual to strong consistency guarantees). Special consideration is given to high-performance support for cross-data-center replication, using geo-diversity both for performance and fault-tolerance reasons.
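
    To make the read-consistency idea concrete, here is a minimal single-process sketch of how strongly consistent reads might be spread across every node of a replication chain. It assumes a clean/dirty versioning scheme: a write travels from the head toward the tail, stays "dirty" at each node until the tail's acknowledgement travels back, and a node holding a dirty version asks the tail which version is committed before answering. The names Node, Chain, start_write, finish_write, and read are illustrative only, not CRAQ's actual interfaces; writes are assumed to finish in the order they started, and the weaker consistency options are omitted.

```python
class Node:
    """One replica: keeps every version it has seen that is not yet superseded."""

    def __init__(self):
        self.versions = {}  # key -> {version number: value}
        self.clean = {}     # key -> newest version known to be committed


class Chain:
    """Simulated chain; nodes[0] is the head and nodes[-1] is the tail."""

    def __init__(self, size):
        self.nodes = [Node() for _ in range(size)]
        self.tail = self.nodes[-1]
        self.next_version = 1

    def start_write(self, key, value, reached):
        """Propagate a write to the first `reached` nodes, where it stays dirty.

        Use reached < chain length to model a write that is still in flight.
        """
        version = self.next_version
        self.next_version += 1
        for node in self.nodes[:reached]:
            node.versions.setdefault(key, {})[version] = value
        return version

    def finish_write(self, key, version):
        """Deliver the write to the rest of the chain, then acknowledge it."""
        value = self.nodes[0].versions[key][version]
        for node in self.nodes:
            node.versions.setdefault(key, {})[version] = value
        # The acknowledgement travels from the tail back to the head: each
        # node marks the version clean and drops the versions it supersedes.
        for node in reversed(self.nodes):
            node.clean[key] = version
            node.versions[key] = {
                v: x for v, x in node.versions[key].items() if v >= version
            }

    def write(self, key, value):
        """A write that runs to completion with nothing else in flight."""
        version = self.start_write(key, value, reached=len(self.nodes))
        self.finish_write(key, version)
        return version

    def read(self, index, key):
        """Strongly consistent read served by the node at position `index`."""
        node = self.nodes[index]
        latest = max(node.versions[key])
        if node.clean.get(key) == latest:
            return node.versions[key][latest]  # clean: answer locally
        committed = self.tail.clean[key]       # dirty: ask the tail
        return node.versions[key][committed]


chain = Chain(3)
chain.write("k", "a")
pending = chain.start_write("k", "b", reached=2)  # has not reached the tail
before = [chain.read(i, "k") for i in range(3)]   # every node still agrees
chain.finish_write("k", pending)
after = [chain.read(i, "k") for i in range(3)]    # now every node sees "b"
```

    In this sketch the tail is consulted only for a version number while a write is in flight, so in a read-mostly workload most reads would be answered locally by whichever node receives them.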

  • Fault-tolerant systems:   Researching general techniques for building robust and scalable services. Investigating the feasibility of using historical information of system output to detect unexpected changes given known inputs. Prophecy implements a new form of consistency called delay-once consistency which, while slightly weaker than the linearizability of Byzantine fault tolerant or quorum systems, can be achieved at a cost almost identical to fully unreplicated reads in read-mostly workloads. Earlier work considered a new approach for group membership via link attestations.

  • Distributed monitoring and debugging:   Developing new techniques for monitoring and analyzing the behavior of large-scale, federated distributed systems. Helping design end-to-end request tracing to graph the causality of operations across function and network calls, providing the equivalent of a distributed, labeled call graph. Currently tracing the production deployments of CoralCDN and OASIS, we are developing automated mechanisms to detect exceptional conditions.

  • Privacy-preserving protocols:   Developed cryptographic protocols for private matching (PM), which computes the set intersection between two or more parties' inputs. PM uses the properties of homomorphic encryption to privately evaluate a polynomial representation of input sets. Subsequent work led to improved constructions for keyword search (KS) based on oblivious pseudorandom functions, as well as explicit consideration of detecting graph proximity for social networks.

    Ongoing work aims to greatly improve the performance of large multi-party computations by relying on a more centralized infrastructure that maintains strong participant and data privacy by dividing trust between two non-colluding service providers.

    Earlier research included the design and implementation of a prototype system for anonymous cryptographic e-cash, as well as considerations for privacy-enabled digital rights management (DRM) systems.
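
    The private matching idea described above (evaluating a polynomial representation of a set under homomorphic encryption) can be illustrated with a small two-party sketch. It rests on several assumptions that are not stated in the text: a textbook Paillier cryptosystem stands in for the additively homomorphic scheme, the key is built from two small Mersenne primes and is far too weak for real use, both parties are honest-but-curious, and all function names are hypothetical. The client encodes its set as the roots of a polynomial and sends the encrypted coefficients; for each of its own elements y, the server returns an encryption of r*poly(y) + y for a fresh random r, which decrypts to y exactly when y is a root.

```python
import math
import random

# Toy Paillier key from two known Mersenne primes. Illustrative only.
P, Q = 2**31 - 1, 2**61 - 1
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # valid because g = N + 1


def encrypt(m):
    """Paillier encryption with g = N + 1: c = (1 + N)^m * r^N mod N^2."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(1 + N, m % N, N2) * pow(r, N, N2) % N2


def decrypt(c):
    """m = L(c^LAM mod N^2) * MU mod N, where L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def poly_from_roots(roots):
    """Coefficients, lowest degree first, of prod (y - x) mod N."""
    coeffs = [1]
    for x in roots:
        nxt = [0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            nxt[j + 1] = (nxt[j + 1] + c) % N   # contribution of c * y
            nxt[j] = (nxt[j] - x * c) % N       # contribution of c * (-x)
        coeffs = nxt
    return coeffs


def client_request(client_set):
    """Client: encrypt every coefficient of its set polynomial."""
    return [encrypt(a) for a in poly_from_roots(client_set)]


def server_respond(enc_coeffs, server_set):
    """Server: for each y, compute Enc(r * poly(y) + y) without decrypting."""
    replies = []
    for y in server_set:
        enc_poly = 1  # a trivial encryption of 0
        for j, c in enumerate(enc_coeffs):
            # Enc(a_j)^(y^j) encrypts a_j * y^j; multiplying ciphertexts adds.
            enc_poly = enc_poly * pow(c, pow(y, j, N), N2) % N2
        r = random.randrange(1, N)
        replies.append(pow(enc_poly, r, N2) * encrypt(y) % N2)
    random.shuffle(replies)  # hide which server element produced which reply
    return replies


def client_intersect(client_set, replies):
    """Client: decryptions that fall in its own set form the intersection."""
    return set(client_set) & {decrypt(c) for c in replies}


client_set = [3, 17, 42, 99]
server_set = [5, 17, 99, 1000]
replies = server_respond(client_request(client_set), server_set)
common = client_intersect(client_set, replies)
```

    For a server element outside the client's set, the reply decrypts to a value blinded by r, so in this sketch the client learns essentially nothing about it beyond the fact that it did not match.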

    Prior research includes:

  • IP analytics:   By instrumenting CoralCDN and third-party sites, used active web content to measure and analyze the characteristics of over 7 million clients with respect to "edge technologies" (NATs, proxies, DNS, and DHCP). Results quantify how Internet services can use IP addresses to identify clients and enforce access-control decisions. Commercialized historical and real-time techniques for proxy detection and IP geolocation; acquired by Quova, Inc. in November 2006 and currently being tested by large Internet services.

  • Enterprise network security and management:   Ethane provides a backwards-compatible protection and management architecture for enterprise networks. Ethane network switches provide connectivity through on-demand virtual circuits, yet they enforce security policies on a per-flow basis through centrally-managed, atomic, auditable name bindings. Multi-month deployment at Stanford served hundreds of both wired and wireless hosts. Technology is now being commercialized by Nicira Networks and supported by major switch vendors as part of the OpenFlow Consortium.

  • Anycast:   Designed and built OASIS, a server-selection infrastructure that provides locality- and load-based anycast for replicated Internet services. OASIS tackles the problems of leveraging disparate services to perform (potentially error-prone) network measurement and of scalably managing state information about many services and their participating nodes. OASIS has been in production use since November 2005 and has been adopted by more than a dozen distributed services, handling thousands of replicas. Background measurement studies examine the geographic locality of IP prefixes and the efficacy of virtual coordinate systems.

  • Re: Reliable email:   An email acceptance system that leverages social proximity for automated whitelisting, to be used preceding a mail rejection system that performs spam filtering. Re: employs our private matching protocols to maintain user privacy. Recent analysis of privacy for social networks led to more efficient protocols based only on symmetric-key operations (or achieving stronger properties using bilinear maps).

  • Secure distributed file systems and file dissemination:   With a focus on settings with mutually-distrustful clients, Shark provides a distributed file system that improves scalability and performance through cooperative reads--whereby mutually-distrustful clients can exploit each other's caches to reduce load on an origin file server--using Coral's indexing layer to locate files. Yet Shark preserves traditional semantics, manageability, and security.

    Other research developed a novel verification scheme that was the first to provide integrity guarantees for large files encoded with rateless erasure codes. It uses a homomorphic hash function and batching to verify downloaded blocks on-the-fly.
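
    The homomorphic property that makes on-the-fly verification possible can be shown with a small sketch. It assumes a discrete-log style hash, h(b) = prod g_i^(b_i) mod P, over blocks whose sub-blocks are integers modulo a prime Q that divides P - 1, and it assumes that encoded check blocks are formed by adding source blocks element-wise modulo Q. The parameters below are toy values chosen for readability, the generators are derived deterministically rather than securely, the batching step is omitted, and all function names are hypothetical.

```python
# Toy group parameters: Q divides P - 1, so elements of order Q exist mod P.
P = 2**31 - 1
Q = 331
assert (P - 1) % Q == 0


def make_generators(count):
    """Pick `count` elements of order Q by raising small bases to (P - 1) / Q."""
    gens, base = [], 2
    while len(gens) < count:
        g = pow(base, (P - 1) // Q, P)
        if g != 1:  # g^Q = 1 by Fermat, so any g != 1 has order exactly Q
            gens.append(g)
        base += 1
    return gens


def block_hash(block, gens):
    """h(b) = prod_i g_i^(b_i) mod P for a block of sub-blocks b_i in Z_Q."""
    h = 1
    for g, b in zip(gens, block):
        h = h * pow(g, b % Q, P) % P
    return h


def combine(blocks):
    """An encoded check block: element-wise sum of source blocks mod Q."""
    return [sum(column) % Q for column in zip(*blocks)]


def verify(check_block, source_hashes, gens):
    """Accept iff hash(check block) equals the product of the source hashes."""
    expected = 1
    for h in source_hashes:
        expected = expected * h % P
    return block_hash(check_block, gens) == expected


gens = make_generators(4)
sources = [[1, 2, 3, 4], [10, 20, 30, 40], [300, 5, 7, 330]]
hashes = [block_hash(b, gens) for b in sources]  # published in advance

check = combine([sources[0], sources[2]])        # what a downloader receives
ok = verify(check, [hashes[0], hashes[2]], gens)

tampered = list(check)
tampered[1] = (tampered[1] + 1) % Q              # corrupt one sub-block
bad = verify(tampered, [hashes[0], hashes[2]], gens)
```

    Because the hash of a sum equals the product of the hashes, a downloader in this sketch needs only the hashes of the source blocks to check any encoded block as it arrives, without waiting to decode the whole file.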

  • Anonymity systems:   Designed and implemented Tarzan, a peer-to-peer anonymous IP network layer that is strongly resistant to traffic analysis. Tarzan helped protect against Sybil attacks through the selection of neighbors in a constrained and verifiable manner. Helped design Free Haven, a system for distributed, anonymous, persistent data storage, robust against powerful attacks trying to find and destroy data.