Computer Laboratory

Technical reports

Minimizing latency of agreement protocols

Piotr Zieliński

June 2006, 239 pages

This technical report is based on a dissertation submitted September 2005 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity Hall.

Abstract

Maintaining consistency of fault-tolerant distributed systems is notoriously difficult to achieve. It often requires non-trivial agreement abstractions, such as Consensus, Atomic Broadcast, or Atomic Commitment. This thesis investigates implementations of such abstractions in the asynchronous model, extended with unreliable failure detectors or eventual synchrony. The main objective is to develop protocols that minimize the number of communication steps required in failure-free scenarios but remain correct if failures occur. For several agreement problems and their numerous variants, this thesis presents such low-latency algorithms and lower-bound theorems proving their optimality.

The observation that many agreement protocols share the same round-based structure helps to cope with a large number of agreement problems in a uniform way. One of the main contributions of this thesis is “Optimistically Terminating Consensus” (OTC) – a new lightweight agreement abstraction that formalizes the notion of a round. It is used to provide simple modular solutions to a large variety of agreement problems, including Consensus, Atomic Commitment, and Interactive Consistency. The OTC abstraction tolerates malicious participants and has no latency overhead; agreement protocols constructed in the OTC framework require no more communication steps than their ad-hoc counterparts.

The attractiveness of this approach lies in the fact that the correctness of OTC algorithms can be tested automatically. A theory developed in this thesis allows us to quickly evaluate OTC algorithm candidates without the time-consuming examination of their entire state space. This technique is then used to scan the space of possible solutions in order to automatically discover new low-latency OTC algorithms. From these, one can now easily obtain new implementations of Consensus and similar agreement problems such as Atomic Commitment or Interactive Consistency.

Because of its continuous nature, Atomic Broadcast is considered separately from other agreement abstractions. I first show that no algorithm can guarantee a latency of less than three communication steps in all failure-free scenarios. Then, I present new Atomic Broadcast algorithms that achieve the two-step latency in some special cases, while still guaranteeing three steps for other failure-free scenarios. The special cases considered here are: Optimistic Atomic Broadcast, (Optimistic) Generic Broadcast, and closed-group Atomic Broadcast. For each of these, I present an appropriate algorithm and prove its latency to be optimal.

Full text

PDF (2.0 MB)

BibTeX record

@TechReport{UCAM-CL-TR-667,
  author =	 {Zieli{\'n}ski, Piotr},
  title = 	 {{Minimizing latency of agreement protocols}},
  year = 	 2006,
  month = 	 jun,
  url = 	 {http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-667.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  number = 	 {UCAM-CL-TR-667}
}