Research and Projects
My research interests span the areas of distributed systems, security,
networking, and applied cryptography. I particularly enjoy devising
technologies that make new functionality broadly available, often by
coupling principled designs with real-world deployments.
Research is funded through the National Science Foundation (Awards #0831374, #0904860, #0904729), the Office of Naval Research (Young Investigator Program), Cisco Systems, and Princeton University (Grand Challenges Program and Emerson Electric Award).
Ongoing research includes:
- Peer-assisted content distribution:
Conceived and led the Coral Project. Designed and built an
Internet-scale, self-organizing web-content distribution network.
CoralCDN uses a network of cooperating DNS redirectors and HTTP
proxies, backed by a decentralized indexing infrastructure, to allow
oblivious clients to transparently download content from nearby
servers, while avoiding distant or heavily-loaded ones. CoralCDN
has been in production use on 300 servers since March 2004,
currently receiving about 25 million HTTP requests from over 1
million clients per day, serving several terabytes of data.
Ongoing research focuses on
untrusted peer-assisted CDNs, employing price theory to study
how peer demand can be efficiently matched to available supply.
This work characterizes the efficiency and robustness gains that are
enabled by price-based multilateral exchange, as opposed to
bilateral exchanges such as BitTorrent. The corresponding system
mechanisms of PACE (Price-Assisted Content Exchange) encourage both
the sharing of desirable content and network-friendly resource
utilization.
Leveraging these techniques for peer selection and resource
allocation, we are currently building a new browser-based P2P-CDN
called FireCoral. FireCoral enables mutually-distrustful users to
"share" their browser caches, yet ensures the authenticity of
content and enables users to preserve privacy by expressing flexible
content sharing policies.
- Federated virtual worlds: Researching design issues---scalable,
secure system and network architectures---for building
Internet-scale virtual worlds. We focus especially on the challenges
of supporting large-scale worlds; providing immersive visual,
audio, and physics experiences; enforcing spam-resistant
communication models; and federating world providers.
- Service-centric network architectures: Designing and developing a new
network architecture, SCAFFOLD, that argues for directly addressing
(potentially distributed or replicated) objects or services, rather
than hosts, and more tightly coupling naming and routing. Rather
than employing ad-hoc techniques at various network layers, SCAFFOLD
directly focuses on handling the many kinds of churn---from
failures, planned maintenance, load balancing, workload migration,
and physical mobility---that arise in networked services.
- Scalable storage systems: Developing a datacenter-focused storage
system---CRAQ (Chain Replication with Apportioned Queries)---capable
of good availability, high throughput, and low latency, while
providing a sliding scale of read consistency options (from eventual
to strong consistency guarantees). Special consideration is given
to high-performance support for cross-data-center replication, using
geo-diversity both for performance and fault-tolerance reasons.
- Fault-tolerant systems: Researching general techniques for
building robust and scalable services. Investigating the
feasibility of using historical information of system output to
detect unexpected changes given known inputs. Prophecy implements a
new form of consistency called delay-once consistency, which,
while slightly weaker than the linearizability of Byzantine fault
tolerant or quorum systems, can be achieved at a cost almost
identical to fully unreplicated reads in read-mostly workloads.
Earlier work considered a new approach for group membership via link attestations.
- Distributed monitoring and debugging: Developing new techniques for
monitoring and analyzing the behavior of large-scale, federated
distributed systems. Helping design end-to-end request tracing to
graph the causality of operations across function and network calls,
providing the equivalent of a distributed, labeled call graph.
Currently tracing the production deployments of CoralCDN and OASIS,
we are developing automated mechanisms to detect exceptional
conditions.
- Privacy-preserving protocols: Developed
cryptographic protocols for private matching (PM), which computes
the set intersection between two or more parties'
inputs. PM uses the properties of homomorphic
encryption to privately evaluate a polynomial representation of
input sets. Subsequent work led to improved constructions for
keyword search (KS) based on oblivious pseudorandom
functions, as well as explicit consideration of
detecting graph proximity for social networks.
Ongoing work looks at greatly improving performance for large
multi-party computation by relying on more centralized
infrastructure that maintains both strong participant and data
privacy by dividing trust assumptions between two non-colluding
service providers.
Earlier research included the design and implementation of a
prototype system for anonymous cryptographic e-cash, as well as
considerations for privacy-enabled digital rights
management (DRM) systems.
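
As an illustration of the private matching idea described above (a polynomial representation of one party's set, evaluated under homomorphic encryption), here is a minimal sketch. It is a toy under stated assumptions, not the published protocol: it swaps in a textbook Paillier scheme with small, insecure fixed primes, and all function names are hypothetical. The client encrypts the coefficients of a polynomial whose roots are its inputs; the server homomorphically computes an encryption of r*P(y) + y for each of its own inputs y, which decrypts to y only when y is a root.

```python
import secrets

# Toy Paillier parameters (assumption: two small Mersenne primes, NOT secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
PHI_INV = pow(PHI, -1, N)  # exists because gcd(PHI, N) == 1 for these primes


def encrypt(m):
    """Paillier encryption with generator g = N + 1."""
    r = secrets.randbelow(N - 1) + 1
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """Paillier decryption; recovers m mod N."""
    return (pow(c, PHI, N2) - 1) // N * PHI_INV % N


def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod (x - a) mod N."""
    coeffs = [1]
    for a in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - a * c) % N
            nxt[i + 1] = (nxt[i + 1] + c) % N
        coeffs = nxt
    return coeffs


def server_respond(enc_coeffs, server_set):
    """For each y, return Enc(r * P(y) + y) without learning P."""
    replies = []
    for y in server_set:
        # Horner's rule on ciphertexts: multiplying ciphertexts adds
        # plaintexts, and exponentiation scales the plaintext.
        acc = enc_coeffs[-1]
        for c in reversed(enc_coeffs[:-1]):
            acc = pow(acc, y, N2) * c % N2
        r = secrets.randbelow(N - 1) + 1  # random blinding factor
        replies.append(pow(acc, r, N2) * encrypt(y) % N2)
    return replies


def private_intersection(client_set, server_set):
    """Client-side driver: returns the elements common to both sets."""
    enc_coeffs = [encrypt(c) for c in poly_from_roots(client_set)]
    replies = server_respond(enc_coeffs, server_set)
    # Non-matching replies decrypt to blinded values that fall in the
    # client's set only with negligible probability.
    return {decrypt(c) for c in replies} & set(client_set)
```

Calling `private_intersection({3, 17, 42}, {17, 42, 99})` should yield the shared elements, while the server sees only ciphertexts and the client sees blinded values for the server's non-matching inputs.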
Prior research includes:
- IP analytics: By instrumenting CoralCDN and third-party
sites, used active web content to measure and analyze the
characteristics of over 7 million clients with respect to "edge
technologies" (NATs, proxies, DNS and DHCP). Results quantify how
Internet services can use IP addresses to identify clients and
enforce access-control decisions. Commercialized historical and
real-time techniques for proxy detection and IP geolocation; acquired
by Quova, Inc. in November 2006
and currently being tested by large Internet services.
- Enterprise network security and management:
Ethane provides a backwards-compatible protection and
management architecture for enterprise networks. Ethane network
switches provide connectivity through on-demand virtual circuits,
yet they enforce security policies on a per-flow basis through
centrally-managed, atomic, auditable name bindings. Multi-month
deployment at Stanford served hundreds of both wired and wireless
hosts. Technology is now being commercialized by Nicira Networks and supported by
major switch vendors as part of the OpenFlow Consortium.
- Anycast:
Designed and built OASIS, a server-selection infrastructure that
provides locality- and load-based anycast for replicated Internet
services. OASIS tackles the problems of leveraging disparate
services to perform (potentially error-prone) network measurement
and of scalably managing state information about many services and
their participating nodes. OASIS has been in production use since
November 2005 and has been adopted by more than a dozen distributed
services, handling thousands of replicas. Background measurement
studies examine the geographic
locality of IP prefixes and the efficacy of virtual
coordinate systems.
- Re: Reliable email:
An email acceptance system that leverages social proximity for
automated whitelisting, to be used preceding a mail rejection system
that performs spam filtering. Re: employs our private matching protocols to
maintain user privacy. Recent analysis of privacy for social
networks led to more efficient protocols based only on symmetric-key
operations (or achieving stronger properties using bilinear maps).
- Secure distributed file systems and file dissemination:
With a focus on settings with mutually-distrustful clients, Shark
provides a distributed file system that improves scalability and
performance through cooperative reads--whereby mutually-distrustful
clients can exploit each other's caches to reduce load on an origin
file server--using Coral's indexing layer to locate files. Yet Shark
preserves traditional semantics, manageability, and
security.
Other research developed a novel verification scheme that was the
first to provide integrity guarantees for large files encoded with
rateless erasure codes. It uses a homomorphic hash function and
batching to verify downloaded blocks on-the-fly.
- Anonymity systems:
Designed and implemented Tarzan, a peer-to-peer
anonymous IP network layer that is strongly resistant to traffic
analysis. Tarzan helped protect against Sybil attacks through the
selection of neighbors in a constrained and verifiable
manner. Helped design Free Haven, a system for distributed,
anonymous, persistent data
storage, robust against powerful attacks trying to find and destroy
data.
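
The homomorphic-hash verification mentioned under secure file dissemination above can be illustrated with a small sketch. This is a toy under stated assumptions rather than the published scheme: the group parameters are tiny and insecure, the generators are chosen naively, batching is omitted, and all names are hypothetical. The point it shows is the homomorphic property: the hash of a sum of blocks equals the product of the blocks' hashes, so a receiver holding only the published hashes of the source blocks can check an encoded block before decoding it.

```python
# Toy group parameters (assumption): P_MOD = 2 * Q_ORD + 1, both prime.
P_MOD, Q_ORD = 2039, 1019

# Squares mod P_MOD have order Q_ORD. A real deployment would need
# generators with no known discrete-log relations between them.
GENS = [pow(b, 2, P_MOD) for b in (2, 3, 5, 7)]


def hhash(block):
    """Hash a block (a vector over Z_q) as the product of g_i^{b_i} mod p."""
    h = 1
    for g, x in zip(GENS, block):
        h = h * pow(g, x % Q_ORD, P_MOD) % P_MOD
    return h


def combine(blocks):
    """Encode a check block as the element-wise sum of blocks mod q."""
    return [sum(col) % Q_ORD for col in zip(*blocks)]


def verify_check_block(check_block, source_hashes, indices):
    """Accept iff hash(check) equals the product of the named source hashes."""
    expected = 1
    for i in indices:
        expected = expected * source_hashes[i] % P_MOD
    return hhash(check_block) == expected


# Usage: the publisher hashes source blocks; a downloader verifies a
# check block built from blocks 0 and 2 without having either of them.
source = [[1, 2, 3, 4], [5, 6, 7, 8], [1000, 1018, 0, 7]]
source_hashes = [hhash(b) for b in source]
check = combine([source[0], source[2]])
ok = verify_check_block(check, source_hashes, [0, 2])
```

A corrupted check block changes the left-hand hash and fails the comparison, which is what lets a downloader discard bad blocks on the fly.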