Department of Computer Science and Technology

Technical reports

Distributed consensus revised

Heidi Howard

April 2019, 151 pages

This technical report is based on a dissertation submitted September 2018 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Pembroke College.

DOI: 10.48456/tr-935

Abstract

We depend upon distributed systems in every aspect of life. Distributed consensus, the ability to reach agreement in the face of failures and asynchrony, is a fundamental and powerful primitive for constructing reliable distributed systems from unreliable components.

For over two decades, the Paxos algorithm has been synonymous with distributed consensus. Paxos is widely deployed in production systems, yet it is poorly understood and it proves to be heavyweight, unscalable and unreliable in practice. As such, Paxos has been the subject of extensive research to better understand the algorithm, to optimise its performance and to mitigate its limitations.

In this thesis, we re-examine the foundations of how Paxos solves distributed consensus. Our hypothesis is that these limitations are not inherent to the problem of consensus but instead specific to the approach of Paxos. The surprising result of our analysis is a substantial weakening to the requirements of this widely studied algorithm. Building on this insight, we are able to prove an extensive generalisation over the Paxos algorithm.

Our revised understanding of distributed consensus enables us to construct a diverse family of algorithms for solving consensus; covering classical as well as novel algorithms to reach consensus where it was previously thought impossible. We will explore the wide reaching implications of this new understanding, ranging from pragmatic optimisations to production systems to fundamentally novel approaches to consensus, which achieve new tradeoffs in performance, scalability and reliability.

Full text

PDF (1.2 MB)

BibTeX record

@TechReport{UCAM-CL-TR-935,
  author =	 {Howard, Heidi},
  title = 	 {{Distributed consensus revised}},
  year = 	 2019,
  month = 	 apr,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-935.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-935},
  number = 	 {UCAM-CL-TR-935}
}