Chapter 22 Distributed transactions
Objectives
To extend the study of transaction processing systems to allow for a distributed implementation. To study concurrency control and commitment in a distributed context.
Points to emphasise
- We reiterate the special properties of distributed systems from Chapter 14. We need to consider independent failure modes and the absence of global time. We are not concerned with object replicas nor with inconsistency resulting from communication delay.
- The methods of concurrency control in which decisions on operation invocations are taken locally at each object (TSO and OCC) distribute naturally. A single manager is assumed to co-ordinate object invocations and commitment.
- At first sight it seems that we need a distributed deadlock detection algorithm to implement distributed 2PL. In practice, a timeout mechanism could be used instead: a transaction that waits too long for a lock is aborted.
- All the objects involved in a transaction must make the same decision with respect to commitment or abortion. This is more difficult to achieve in a distributed system, where components may fail independently of each other at any time. The two-phase commit (2PC) protocol attempts to achieve this. Note that there is a single point of decision in the protocol. More complex protocols, e.g. 3PC, specify detailed recovery procedures, which are discussed informally here.
- Atomic commitment is pessimistic and is contrary to the philosophy of OCC. A two-phase validation protocol is outlined for OCC during which shadow objects may be taken for the execution phase of other transactions.
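The 2PC point above can be illustrated with a short sketch for use in class. It is a simplified, single-process model, not the protocol as specified in the chapter. The names `Participant`, `two_phase_commit`, `can_commit` and `reachable` are invented for illustration, a Python list stands in for each node's persistent log, and an unreachable participant is modelled by raising `TimeoutError`. The sketch shows the single point of decision: the manager records the outcome once, and only then tells the participants.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """Hypothetical participant managing one object of the transaction."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.log = []            # stands in for persistent storage
        self.state = "active"

    def prepare(self):
        """Phase 1: vote, recording the vote before replying."""
        if not self.reachable:
            raise TimeoutError(self.name)  # models a failed or cut-off node
        vote = Vote.COMMIT if self.can_commit else Vote.ABORT
        self.log.append(("voted", vote.value))
        self.state = "prepared" if vote is Vote.COMMIT else "aborted"
        return vote

    def decide(self, decision):
        """Phase 2: record and apply the manager's decision."""
        if not self.reachable:
            return  # would learn the outcome during recovery
        self.log.append(("decision", decision))
        self.state = decision


def two_phase_commit(participants):
    """Run both phases; return the decision and the manager's log."""
    manager_log = []
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append(Vote.ABORT)  # no reply is treated as a vote to abort
    all_yes = all(v is Vote.COMMIT for v in votes)
    decision = "committed" if all_yes else "aborted"
    manager_log.append(("decision", decision))  # the single point of decision
    for p in participants:
        p.decide(decision)
    return decision, manager_log
```

A useful discussion point is what each log must contain for recovery: a participant that has logged a commit vote but no decision cannot decide unilaterally and must ask the manager.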
Possible difficulties
As in Chapter 19, OCC is the most difficult method to understand because it is inherently non-strict. The discussion should grow out of Chapters 15, 18 and 19 in a natural way.
Teaching hints
- The approaches to distributing the various methods of concurrency control are outlined in the chapter. In each case, the students could be asked to fill in the detailed actions of the manager and participants. For example, ask the students to specify a transaction manager which uses distributed 2PL based on timeouts instead of an algorithm for distributed deadlock detection.
- Recall previous discussions on pessimistic and optimistic approaches to concurrency control. An atomic implementation of commitment is associated with pessimistic methods.
- Ask how atomic commitment is achieved in a centralised implementation. What are the new problems in a distributed implementation? How can a single point of decision be enforced in the presence of failures? What must a node record in persistent memory in case of failure? How is this used in the recovery procedures after failure?
- Recall the validation and update procedures for OCC. What additional problems are introduced in a distributed implementation?
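For the first hint above (distributed 2PL based on timeouts), the following sketch may serve as a starting point or as a model answer to compare against. It is an illustration under stated assumptions, not a design from the chapter. The class names `ObjectLock` and `Transaction` and the 0.05 second timeout are invented, and all objects live in one process rather than on separate nodes. A transaction that cannot obtain a lock within the timeout aborts and releases everything it holds, so no deadlock detection is needed.

```python
import threading


class LockTimeout(Exception):
    """Raised when a lock is not granted in time."""


class ObjectLock:
    """Hypothetical lock kept at the node that manages one object."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.holder = None

    def acquire(self, txn_id, timeout):
        # Wait at most `timeout` seconds instead of waiting indefinitely.
        if not self._lock.acquire(timeout=timeout):
            raise LockTimeout(f"{txn_id} timed out waiting for {self.name}")
        self.holder = txn_id

    def release(self, txn_id):
        if self.holder == txn_id:
            self.holder = None
            self._lock.release()


class Transaction:
    """Strict 2PL: locks are acquired as needed and released only at the end."""

    def __init__(self, txn_id, timeout=0.05):
        self.txn_id = txn_id
        self.timeout = timeout   # assumed value, chosen only for the example
        self.held = []
        self.state = "active"

    def lock(self, obj):
        """Growing phase: obtain a lock, or abort if the wait times out."""
        try:
            obj.acquire(self.txn_id, self.timeout)
        except LockTimeout:
            self.finish("aborted")
            return False
        self.held.append(obj)
        return True

    def finish(self, outcome):
        """Shrinking phase: release every lock at commit or abort."""
        for obj in reversed(self.held):
            obj.release(self.txn_id)
        self.held.clear()
        self.state = outcome
```

A possible scenario to walk through: T1 locks A, T2 locks B, then T2 requests A. T2 times out, aborts and releases B, after which T1 can lock B and commit. Students can then discuss the cost of this choice: a timeout cannot distinguish deadlock from a slow or heavily loaded node, so some transactions are aborted unnecessarily.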