Research and Projects

    My research interests span the areas of distributed systems, security, networking, and cryptography. I particularly enjoy devising technologies that make new functionality broadly available, often by coupling principled designs with real-world deployments.


    Ongoing research includes:

  • Peer-assisted content distribution:   Conceived and led the Coral Project. Designed and built an Internet-scale, self-organizing web-content distribution network. CoralCDN uses a network of cooperating DNS redirectors and HTTP proxies, backed by a decentralized indexing infrastructure, to allow oblivious clients to transparently download content from nearby servers, while avoiding distant or heavily-loaded ones. CoralCDN has been in production use on 300 servers since March 2004, currently receiving about 25 million HTTP requests from over 1 million clients per day, serving several terabytes of data.

    Ongoing research focuses on untrusted peer-assisted CDNs, employing price theory to study how peer demand can be efficiently matched to available supply. Our work provides both theoretical and practical steps forward. First, we rigorously analyze the efficiency and robustness gains that are enabled by price-based multilateral exchange. We show that multilateral exchanges satisfy several desirable efficiency and robustness properties that bilateral exchanges such as BitTorrent do not, particularly when considering multiple files. Second, our system design, PACE (Price-Assisted Content Exchange), realizes many of the benefits of a price-based multilateral exchange; it encourages sharing of desirable content and network-friendly resource utilization.

  • Fault-tolerant systems:   Researching general techniques for building robust and scalable services. Investigating the feasibility of using historical information about system output to detect unexpected changes given known inputs. Designing the accompanying system for real-time fault detection---and potentially masking if deployed on-path---in legacy systems of heterogeneous components. Also pursuing theoretical problems within the context of scaling more traditional Byzantine fault-tolerant services, by securely partitioning large-scale systems into smaller groups that provide locally reliable service. Earlier work considered a new approach for group membership via link attestations.

  • Distributed monitoring and debugging:   Developing new techniques for monitoring and analyzing the behavior of large-scale, federated distributed systems. First, helping design end-to-end request tracing to graph the causality of operations across function and network calls, providing the equivalent of a distributed, labeled call graph. Developing graph-based clustering algorithms to analyze millions of such traces---we are currently tracing the production deployments of CoralCDN and OASIS---in order to establish characterizations of system behavior, helping operators to identify normal behavior and isolate exceptional conditions. Second, we are investigating the real-time adaptation of program logging driven by distributed queries, allowing operators to dynamically change the granularity of data collection and to perform distributed monitoring, debugging, and condition verification.

  • Privacy-preserving protocols:   Developed cryptographic protocols for private matching (PM), which computes the set intersection between two or more parties' inputs. PM uses the properties of homomorphic encryption to privately evaluate a polynomial representation of input sets. Subsequent work led to improved constructions for keyword search (KS) based on oblivious pseudorandom functions, as well as explicit consideration of detecting graph proximity for social networks.

    Earlier research included the design and implementation of a prototype system for anonymous cryptographic e-cash, as well as considerations for privacy-enabled digital rights management (DRM) systems.
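
    As an illustration of the private matching idea described above, the following is a minimal sketch rather than the published protocol. It assumes a toy Paillier cryptosystem as the homomorphic scheme (the text does not name one), uses insecurely small fixed primes, and all function names are hypothetical. The client encodes its set as the roots of a polynomial, sends the encrypted coefficients, and the server homomorphically computes Enc(r*P(y) + y) for each of its own elements y; the result decrypts to y only when y is a root, that is, when y is in both sets.

```python
import secrets

# Toy Paillier keys built from two Mersenne primes. Far too small to be
# secure; chosen only so the example runs instantly (an assumption).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # inverse of PHI mod N, valid for generator g = N + 1


def encrypt(m):
    """Paillier encryption with g = N + 1: (1 + m*N) * r^N mod N^2."""
    r = secrets.randbelow(N - 1) + 1
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """Recover m using L(u) = (u - 1) // N, then multiply by MU mod N."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod (x - a) mod N."""
    coeffs = [1]
    for a in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - a * c) % N      # contribution of -a * c * x^i
            nxt[i + 1] = (nxt[i + 1] + c) % N  # contribution of c * x^(i+1)
        coeffs = nxt
    return coeffs


def server_respond(enc_coeffs, y):
    """Homomorphically compute Enc(r * P(y) + y) without seeing P."""
    acc = encrypt(0)
    power = 1  # y^i mod N
    for c in enc_coeffs:
        acc = acc * pow(c, power, N2) % N2  # adds coeff_i * y^i under encryption
        power = power * y % N
    r = secrets.randbelow(N - 1) + 1        # blinds P(y) when y is not a root
    return pow(acc, r, N2) * encrypt(y) % N2


def private_match(client_set, server_set):
    """Client learns the intersection; non-matching replies look random."""
    enc_coeffs = [encrypt(c) for c in poly_from_roots(client_set)]
    replies = [server_respond(enc_coeffs, y) for y in server_set]
    return {v for v in map(decrypt, replies) if v in client_set}
```

    For example, `private_match({3, 17, 42, 99}, [5, 17, 99, 1000])` yields the two shared elements. A non-matching element decrypts to a blinded value that falls inside the client's set only with negligible probability. A real deployment would need proper key generation and the efficiency techniques that this sketch omits.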

    Prior research includes:

  • IP analytics:   By instrumenting CoralCDN and third-party sites, used active web content to measure and analyze the characteristics of over 7 million clients with respect to "edge technologies" (NATs, proxies, DNS and DHCP). Results quantify how Internet services can use IP addresses to identify clients and enforce access-control decisions. Commercialized historical and real-time techniques for proxy detection and IP geolocation; acquired by Quova, Inc. in November 2006 and currently being tested by large Internet services.

  • Enterprise network security and management:   Ethane provides a backwards-compatible protection and management architecture for enterprise networks. Ethane network switches provide connectivity through on-demand virtual circuits, yet they enforce security policies on a per-flow basis through centrally-managed, atomic, auditable name bindings. Multi-month deployment at Stanford served hundreds of wired and wireless hosts. Technology is now being commercialized by Nicira Networks.

  • Anycast:   Designed and built OASIS, a server-selection infrastructure that provides locality- and load-based anycast for replicated Internet services. OASIS tackles the problems of leveraging disparate services to perform (potentially error-prone) network measurement and of scalably managing state information about many services and their participating nodes. OASIS has been in production use since November 2005 and has been adopted by more than a dozen distributed services, handling thousands of replicas. Background measurement studies examine the geographic locality of IP prefixes and the efficacy of virtual coordinate systems.

  • Re: Reliable email:   An email acceptance system that leverages social proximity for automated whitelisting, to be used preceding a mail rejection system that performs spam filtering. Re: employs our private matching protocols to maintain user privacy. Recent analysis of privacy for social networks led to more efficient protocols based only on symmetric-key operations (or achieving stronger properties using bilinear maps).

  • Secure distributed file systems and file dissemination:   With a focus on settings with mutually-distrustful clients, Shark provides a distributed file system that improves scalability and performance through cooperative reads--whereby clients can exploit each other's caches to reduce load on an origin file server--using Coral's indexing layer to locate files. Yet Shark preserves traditional semantics, manageability, and security.

    Other research developed a novel verification scheme that was the first to provide integrity guarantees for large files encoded with rateless erasure codes. It uses a homomorphic hash function and batching to verify downloaded blocks on-the-fly.

  • Anonymity systems:   Designed and implemented Tarzan, a peer-to-peer anonymous IP network layer that is strongly resistant to traffic analysis. Tarzan helped protect against Sybil attacks through the selection of neighbors in a constrained and verifiable manner. Helped design Free Haven, a system for distributed, anonymous, persistent data storage, robust against powerful attacks trying to find and destroy data.
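
    The on-the-fly verification of erasure-coded blocks mentioned under secure file dissemination above rests on a homomorphic property: the hash of a sum of blocks equals the product of the blocks' hashes. The sketch below illustrates only that property, not the actual scheme. The tiny group parameters, the way generators are chosen, and all names are assumptions for illustration, and batching is omitted.

```python
# Toy group: P_MOD = 2 * Q_ORD + 1, both prime. Real parameters would be
# hundreds of bits or more; these are illustrative assumptions only.
P_MOD = 2039
Q_ORD = 1019
BLOCK_LEN = 4  # number of Z_q elements per block

# Squares of small integers have order Q_ORD in Z_p^*. A real scheme would
# pick generators whose mutual discrete logs are unknown to anyone.
GENS = [pow(x, 2, P_MOD) for x in range(2, 2 + BLOCK_LEN)]


def hhash(block):
    """Homomorphic hash: product of g_i ** b_i mod P_MOD."""
    out = 1
    for g, b in zip(GENS, block):
        out = out * pow(g, b % Q_ORD, P_MOD) % P_MOD
    return out


def combine(blocks):
    """Check block as the element-wise sum of source blocks mod Q_ORD."""
    return [sum(col) % Q_ORD for col in zip(*blocks)]


def verify_check_block(check, source_hashes):
    """Accept a check block iff its hash equals the product of source hashes."""
    expected = 1
    for h in source_hashes:
        expected = expected * h % P_MOD
    return hhash(check) == expected


# The publisher distributes hashes of the original blocks ahead of time.
blocks = [[5, 900, 17, 3], [1000, 2, 44, 8], [7, 7, 7, 7]]
hashes = [hhash(b) for b in blocks]

# A downloader receives a check block built from blocks 0 and 2.
check = combine([blocks[0], blocks[2]])
ok = verify_check_block(check, [hashes[0], hashes[2]])

# A corrupted check block fails the same test.
tampered = check[:]
tampered[1] = (tampered[1] + 1) % Q_ORD
bad = verify_check_block(tampered, [hashes[0], hashes[2]])
```

    Here `ok` is true and `bad` is false: the downloader can reject a corrupted check block using only the published hashes of the original blocks, without first decoding the file.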