Research and Projects
My research interests span the areas of distributed systems, security,
networking, and cryptography. I particularly enjoy devising
technologies that make new functionality broadly available, often by
coupling principled designs with real-world deployments.
Ongoing research includes:
- Peer-assisted content distribution:
Conceived and led the Coral Project. Designed and built an
Internet-scale, self-organizing web-content distribution network.
CoralCDN uses a network of cooperating DNS redirectors and HTTP
proxies, backed by a decentralized indexing infrastructure, to allow
oblivious clients to transparently download content from nearby
servers, while avoiding distant or heavily-loaded ones. CoralCDN
has been in production use on 300 servers since March 2004,
currently receiving about 25 million HTTP requests from over 1
million clients per day, serving several terabytes of data.
Ongoing research focuses on
untrusted peer-assisted CDNs, employing price theory to study
how peer demand can be efficiently matched to available supply. Our
work provides both theoretical and practical steps forward. First,
we rigorously analyze the efficiency and robustness gains that are
enabled by price-based multilateral exchange. We show that
multilateral exchanges satisfy several desirable efficiency and
robustness properties that bilateral exchanges such as BitTorrent do
not, particularly when considering multiple files. Second, our
system design, PACE (Price-Assisted Content Exchange), realizes many
of the benefits of a price-based multilateral exchange; it
encourages sharing of desirable content and network-friendly
resource utilization.
- Fault-tolerant systems:
Researching general techniques for building robust and scalable
services. Investigating the feasibility of using historical
information of system output to detect unexpected changes given
known inputs. Designing the accompanying system for real-time fault
detection---and potentially masking if deployed on-path---in legacy
systems of heterogeneous components. Also pursuing theoretical
problems within the context of scaling more traditional Byzantine
fault-tolerant services, by securely partitioning large-scale
systems into smaller groups that provide locally reliable
service. Earlier work considered a new approach for group membership
via link attestations.
- Distributed monitoring and debugging:
Developing new techniques for monitoring and analyzing the behavior
of large-scale, federated distributed systems. First, helping
design end-to-end request tracing to graph the causality of
operations across function and network calls, providing the
equivalent of a distributed, labeled call graph. Developing
graph-based clustering algorithms to analyze millions of such
traces---we are currently tracing the production deployments of
CoralCDN and OASIS---in order to establish characterizations of
system behavior, helping operators to identify normal behavior and
isolate exceptional conditions. Second, we are investigating the
real-time adaptation of program logging driven by distributed
queries, allowing operators to dynamically change the granularity of
data collection and to perform distributed monitoring, debugging,
and condition verification.
- Privacy-preserving protocols:
Developed
cryptographic protocols for private matching (PM), which computes
the set intersection between two or more parties'
inputs. PM uses the properties of homomorphic
encryption to privately evaluate a polynomial representation of
input sets. Subsequent work led to improved constructions for
keyword search (KS) based on oblivious pseudorandom
functions, as well as explicit consideration of
detecting graph proximity for social networks.
Earlier research included the design and implementation of a
prototype system for anonymous cryptographic e-cash, as well as
considerations for privacy-enabled digital rights
management (DRM) systems.
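
As a rough illustration of the polynomial-evaluation idea behind PM, the
following Python sketch has a client encode its set as the roots of a
polynomial P, encrypt the coefficients under an additively homomorphic
scheme, and let a server return an encryption of r*P(y) + y for each of
its own elements y. This is a minimal sketch under stated assumptions,
not the published construction: the toy Paillier parameters (two
Mersenne primes), the function names, and the in-process "message
exchange" are all hypothetical and chosen only for readability.

```python
import math
import secrets

# Toy Paillier key from two well-known Mersenne primes.
# ASSUMPTION: illustration only; these parameters are far too small to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)   # private key
MU = pow(LAM, -1, N)           # private key

def encrypt(m):
    """Paillier encryption with generator N + 1; m must lie in [0, N)."""
    r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def poly_from_roots(roots):
    """Coefficients (low order first, mod N) of the product of (y - x) over roots."""
    coeffs = [1]
    for x in roots:
        nxt = [0] * (len(coeffs) + 1)
        for j, a in enumerate(coeffs):
            nxt[j] = (nxt[j] - x * a) % N
            nxt[j + 1] = (nxt[j + 1] + a) % N
        coeffs = nxt
    return coeffs

def server_respond(enc_coeffs, y):
    """Return Enc(r * P(y) + y) using only ciphertext operations."""
    acc = 1  # 1 is a trivial encryption of 0
    for j, c in enumerate(enc_coeffs):
        # Enc(a_j) ** (y ** j) is an encryption of a_j * y**j
        acc = acc * pow(c, pow(y, j, N), N2) % N2
    r = secrets.randbelow(N - 1) + 1
    return pow(acc, r, N2) * encrypt(y) % N2

def private_intersection(client_set, server_set):
    # Client: encrypt the coefficients of P and "send" them.
    enc_coeffs = [encrypt(a) for a in poly_from_roots(client_set)]
    # Server: one masked evaluation per element of its own set.
    replies = [server_respond(enc_coeffs, y) for y in server_set]
    # Client: a decrypted value inside its own set signals a match;
    # anything else is a random-looking value that reveals nothing useful.
    return {m for m in map(decrypt, replies) if m in client_set}

common = private_intersection({3, 17, 42, 99}, {5, 17, 99, 1000})
```

When y is a root of P, the random mask vanishes and the client recovers
y itself; otherwise the decrypted value is blinded by r. The sketch
omits the hashing and bucketing optimizations that a practical protocol
would need for large sets.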
Prior research includes:
- IP analytics:
By instrumenting CoralCDN and third-party sites,
used active web content to measure and analyze the characteristics of
over 7 million clients with respect to "edge technologies" (NATs,
proxies, DNS and DHCP). Results quantify how Internet services can
use IP addresses to identify clients and enforce access-control
decisions. Commercialized historical and real-time techniques for
proxy detection and IP geolocation; acquired by
Quova, Inc. in November 2006 and
currently being tested by large Internet services.
- Enterprise network security and management:
Ethane provides a backwards-compatible protection and management
architecture for enterprise networks. Ethane network switches
provide connectivity through on-demand virtual circuits, yet they
enforce security policies on a per-flow basis through
centrally-managed, atomic, auditable name bindings. Multi-month
deployment at Stanford served hundreds of both wired and wireless
hosts. Technology is now being commercialized by Nicira Networks.
- Anycast:
Designed and built OASIS, a server-selection infrastructure that
provides locality- and load-based anycast for replicated Internet
services. OASIS tackles the problems of leveraging disparate
services to perform (potentially error-prone) network measurement
and of scalably managing state information about many services and
their participating nodes. OASIS has been in production use since
November 2005 and has been adopted by more than a dozen distributed
services, handling thousands of replicas. Background measurement
studies examine the geographic
locality of IP prefixes and the efficacy of virtual
coordinate systems.
- Re: Reliable email:
An email acceptance system that leverages social proximity for
automated whitelisting, designed to run ahead of a mail rejection system
that performs spam filtering. Re: employs our private matching protocols to
maintain user privacy. Recent analysis of privacy for social
networks led to more efficient protocols based only on symmetric-key
operations (or achieving stronger properties using bilinear maps).
- Secure distributed file systems and file dissemination:
With a focus on settings with mutually-distrustful clients, Shark
provides a distributed file system that improves scalability and
performance through cooperative reads--whereby mutually-distrustful
clients can exploit each other's caches to reduce load on an origin
file server--using Coral's indexing layer to locate files. Yet Shark
preserves traditional semantics, manageability, and
security.
Other research developed a novel verification scheme that was the
first to provide integrity guarantees for large files encoded with
rateless erasure codes. It uses a homomorphic hash function and
batching to verify downloaded blocks on-the-fly.
- Anonymity systems:
Designed and implemented Tarzan, a peer-to-peer
anonymous IP network layer that is strongly resistant to traffic
analysis. Tarzan helped protect against Sybil attacks through the
selection of neighbors in a constrained and verifiable
manner. Helped design Free
Haven, a system for distributed, anonymous, persistent data
storage, robust against powerful attacks trying to find and destroy
data.
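
To make the homomorphic-hash verification mentioned under file
dissemination more concrete, the Python sketch below hashes a block as a
product of fixed generators raised to the block's sub-block values, so
that the hash of a sum of blocks equals the product of their hashes. A
receiver holding the hashes of the source blocks can then check an
encoded block without seeing the sources. This is a toy under stated
assumptions: the tiny primes, the generator choice, the block layout,
and the function names are hypothetical, check blocks are assumed to be
formed by addition modulo q rather than XOR, and batching is omitted.

```python
import random

# ASSUMPTION: toy parameters for readability; real deployments need large primes
# and generators whose mutual discrete logs are unknown to everyone.
P_MOD = 2039   # prime p
Q_ORD = 1019   # prime q with q | (p - 1)
M = 4          # sub-blocks per block, each a value in [0, q)

# Squares modulo p (other than 1) generate the order-q subgroup when p = 2q + 1.
GENS = [pow(a, (P_MOD - 1) // Q_ORD, P_MOD) for a in range(2, 2 + M)]

def hhash(block):
    """h(b) = product of g_i ** b_i (mod p)."""
    h = 1
    for g, b in zip(GENS, block):
        h = h * pow(g, b, P_MOD) % P_MOD
    return h

def add_blocks(blocks):
    """Form a check block as the component-wise sum of source blocks mod q."""
    return [sum(col) % Q_ORD for col in zip(*blocks)]

def verify_check_block(check_block, source_hashes, indices):
    """Accept iff h(check) equals the product of the named source hashes."""
    expected = 1
    for i in indices:
        expected = expected * source_hashes[i] % P_MOD
    return hhash(check_block) == expected

# Usage: five random source blocks; a check block combining blocks 0, 2, and 3.
rng = random.Random(0)
sources = [[rng.randrange(Q_ORD) for _ in range(M)] for _ in range(5)]
hashes = [hhash(b) for b in sources]
check = add_blocks([sources[0], sources[2], sources[3]])
ok = verify_check_block(check, hashes, [0, 2, 3])

# A corrupted check block no longer matches the expected product.
tampered = list(check)
tampered[1] = (tampered[1] + 1) % Q_ORD
rejected = not verify_check_block(tampered, hashes, [0, 2, 3])
```

Because verification needs only the published source hashes, a
downloader can discard a bad block as soon as it arrives instead of
discovering the corruption after decoding the whole file.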