Tolerating Slowdowns in Replicated State Machines using Copilots: Pseudocode and Proof of Correctness
This technical report presents the pseudocode and a proof of correctness for Copilot replication, a new consensus protocol that achieves 1-slowdown-tolerance: it delivers normal latency despite the slowdown of any 1 replica. Copilot achieves this by using two distinguished replicas—the pilot and copilot—to proactively add redundancy to all stages of processing a client’s command. We give an overview of the Copilot protocol and then present its pseudocode. We also provide a proof of correctness in the asynchronous crash failure model, showing that: Copilot requires 2f + 1 replicas to tolerate at most f crash failures, while guaranteeing safety (linearizability) despite any number of failures, and liveness as long as a majority of replicas communicate in a timely manner. These guarantees are the same as those of Multi-Paxos and EPaxos, which Copilot draws inspiration from. However, Copilot is the only protocol that provides 1-slowdown-tolerance in the presence of any one slow replica.
For a detailed discussion of Copilot’s design, optimizations, and evaluation, please refer to the conference paper.