12-01
Highly Available Byzantine Fault Tolerant Distributed Systems

Many distributed services are hosted at large, shared, geographically diverse data centers, and they use replication to achieve high availability even when an entire data center is unreachable. Recent events show that non-crash faults occur in these services and may lead to long outages. For example, Amazon's S3 service was recently down for at least 7 hours due to a Byzantine fault in its servers. While Byzantine fault tolerance (BFT) could be used to withstand these faults, current BFT protocols can become unavailable if a small fraction of their replicas are unreachable. This is because existing BFT protocols favor strong safety guarantees (consistency) over liveness (availability).

In this talk, I will present Zeno, a novel BFT state machine replication protocol that trades consistency for higher availability. In particular, Zeno replaces strong consistency (linearizability) with a weaker guarantee (eventual consistency): clients can temporarily miss each other's updates, but when the network is stable, the states of the individual partitions are merged by having the replicas agree on a total order for all requests. An evaluation of a Zeno prototype shows that it provides better availability than traditional BFT protocols.

Bio:
Atul Singh is a researcher at NEC Labs, Princeton. He received his PhD in Computer Science from Rice University and spent the last two years visiting the Max Planck Institute for Software Systems (MPI-SWS), Saarbrücken, Germany. Before that, he spent two years visiting Intel Research Berkeley, working with the P2 group. His interests lie in dependable distributed systems, overlay networks, and declarative networking, and he is currently focusing on challenges emerging in the cloud computing arena.

Date and Time

Tuesday, December 1, 2009, 12:30pm - 1:30pm

Location

Computer Science 402

Event Type

Talk

Speaker

Atul Singh, from NEC Labs (Princeton)

Host

Michael Freedman

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers or views presented.
