Tolerating Slowdowns in Replicated State Machines
Khiem Ngo, Siddhartha Sen, Wyatt Lloyd
Aug 31, 2021
Publications
Avicenna: Masking Slowdowns in Replicated State Machines with Counterfactual Evaluation
🏆 Best Paper Award
Geo-distributed replicated state machines (RSMs) are at the heart of many production distributed systems, offering linearizability and fault tolerance via consensus protocols. Avicenna is the first consensus protocol for geo-distributed RSMs that maintains low normal-case latency while tolerating a single fail-slow replica.
Christopher Hodsdon, Zijian Qin, Khiem Ngo, Siddhartha Sen, Ethan Katz-Bassett, Wyatt Lloyd
Replicated state machines are linearizable, fault-tolerant groups of replicas that are coordinated using a consensus algorithm. Copilot replication is the first 1-slowdown-tolerant consensus protocol: it delivers normal latency despite the slowdown of any 1 replica.
Khiem Ngo, Siddhartha Sen, Wyatt Lloyd
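
To make the idea of 1-slowdown tolerance concrete, here is a toy latency model in Python. It is only an illustrative sketch, not the Copilot or Avicenna protocol: it assumes each replica adds a fixed processing delay, that a proposer commits once a majority (counting itself) has responded, and that a client using two proposers takes whichever reply arrives first. It ignores how two proposers would agree on a single order of commands. All function names and delay values are hypothetical.

```python
def quorum_wait(delays, proposer):
    """Time until the proposer has heard from a majority, counting itself."""
    majority = len(delays) // 2 + 1
    acks_needed = majority - 1  # the proposer's own vote is free
    if acks_needed == 0:
        return 0
    others = sorted(d for i, d in enumerate(delays) if i != proposer)
    return others[acks_needed - 1]


def single_leader_latency(delays, leader):
    """Every command pays the leader's own delay plus the quorum wait."""
    return delays[leader] + quorum_wait(delays, leader)


def two_proposer_latency(delays, first, second):
    """The client sends to both proposers and uses the faster reply (assumption)."""
    return min(single_leader_latency(delays, first),
               single_leader_latency(delays, second))


def with_slow_replica(n, slow_index, normal_ms=1, slow_ms=100):
    """Build a hypothetical delay vector with exactly one slow replica."""
    return [slow_ms if i == slow_index else normal_ms for i in range(n)]


healthy = [1, 1, 1, 1, 1]
slow_leader = with_slow_replica(5, slow_index=0)

baseline = single_leader_latency(healthy, leader=0)
one_leader = single_leader_latency(slow_leader, leader=0)
two_proposers = two_proposer_latency(slow_leader, first=0, second=1)
```

In this model a slow follower is already masked by the majority quorum, but a slow leader sits on every command's path, so `one_leader` is far above `baseline`. With two proposers, at most one of them can be the slow replica, so `two_proposers` stays at `baseline` whichever single replica slows down.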