Chapter 21 Recovery

Exercises

21-1 Consider a TP system crash, in which main memory is lost at the following times:

a) A transaction has completed some but not all of its operations.

b) A client has received acknowledgement that a transaction has committed.

c) The system has decided to commit a transaction and has recorded the fact on stable storage but there is a crash before the client can receive the acknowledgement of commit.

Discuss what the system must do on restart for each of these cases. Consider for each case where the data values that have been or are about to be committed might be stored. Why is it essential for the TP system to know? Why might a conventional operating system not make this information available?

a) Undo all the invocations of the transaction. Note that the transaction id and the operations that have been carried out must be recorded in persistent store if any of the results have reached persistent store. If all the information was held in main memory, its loss leaves the situation as though the transaction had never taken place.

b) The result must persist whatever happens. The information must be recorded, with redundancy, in persistent store on commit.

c) If the decision point is passed then the transaction is committed. The information must be in persistent store before the decision to commit can be taken. We hope that recovery is fast enough for the client to be informed. In a booking system, for example, the seat is booked for her but she hasn’t been told.
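The three cases can be sketched as a restart decision over whatever log survives on stable storage. This is a minimal illustration, not the algorithm of the chapter: the record layout `(kind, tid)` and the function name `classify_on_restart` are assumptions made for the example. A transaction with a commit record must be made to persist (cases b and c); any other transaction seen in the log must be undone (case a).

```python
def classify_on_restart(stable_log):
    """Split the transaction ids found in the surviving log into
    (those with a commit record, those without)."""
    seen, committed = set(), set()
    for kind, tid in stable_log:
        seen.add(tid)
        if kind == "commit":
            committed.add(tid)
    return committed, seen - committed


# Hypothetical log contents at the time of the crash.
log = [
    ("start", "T1"), ("update", "T1"),                   # case a: incomplete
    ("start", "T2"), ("update", "T2"), ("commit", "T2"),  # cases b and c
]
redo, undo = classify_on_restart(log)
```

The log alone cannot distinguish case b from case c, so in this sketch both are treated alike: the commit stands, and the acknowledgement would be sent again if the client may not have received it.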

21-2 A recovery log is very large and it may be used for transaction abort as well as crash recovery. How can a periodic checkpoint procedure help to manage this complex process?

Checkpoints add structure to the log. After a checkpoint is taken we know that object updates and log records for the transactions noted in the checkpoint have reached persistent store. Crash recovery can start from the most recent checkpoint.
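As a rough sketch of how a checkpoint bounds the work, the function below finds the most recent checkpoint record and returns the transactions noted in it together with the log records that follow it. The record layout, with a checkpoint record carrying the list of transactions active at that time, and the name `recovery_start` are assumptions for illustration only.

```python
def recovery_start(log):
    """Return (transactions active at the last checkpoint,
    log records written after that checkpoint)."""
    last = -1
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            last = i
    active = set(log[last][1]) if last >= 0 else set()
    # With no checkpoint at all, the whole log is returned.
    return active, log[last + 1:]


# Hypothetical log: T1 finished before the checkpoint, T2 spans it.
log = [
    ("start", "T1"), ("update", "T1"), ("commit", "T1"),
    ("start", "T2"), ("update", "T2"),
    ("checkpoint", ["T2"]),
    ("update", "T2"), ("start", "T3"), ("commit", "T3"),
]
active, tail = recovery_start(log)
```

Here T1 needs no attention on restart. Only T2, noted as active in the checkpoint, and T3, which started after it, have to be examined.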

21-3 Why must undo and redo operations be idempotent?

A crash may take place while recovery is in progress. We do not know on restart whether or not we have undone or redone an operation.
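A small illustration of the difference, using a bank balance and two hypothetical undo functions. Undoing a credit by applying the inverse operation is not idempotent, whereas undoing it by reinstating a recorded prior state is.

```python
def undo_by_inverse(balance, amount):
    # Undo a credit by debiting the same amount: NOT idempotent.
    return balance - amount


def undo_by_restore(balance, prior_state):
    # Undo by reinstating the recorded prior state: idempotent.
    return prior_state


prior = 100
after_credit = prior + 30

# Recovery crashes and restarts, so each undo is applied twice.
inverse_once = undo_by_inverse(after_credit, 30)
inverse_twice = undo_by_inverse(inverse_once, 30)

restore_once = undo_by_restore(after_credit, prior)
restore_twice = undo_by_restore(restore_once, prior)
```

After the repeated undo, the inverse-operation version has debited the account twice and left the balance at 70, while the state-restoring version still gives 100.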

21-4 Consider a TP system based on the bank account objects used as an example in Section 21.6 and shown in Figure 21.4.

For each operation define an undo operation.

credit: debit and vice versa

add-interest-to-balance: the inverse computation can be carried out

set-interest-rate: this is an overwrite and can be undone only if the pre- and post-states are recorded.

Suppose that a recovery log record contains:

Transaction id, object id, operation, arguments, state prior to invocation.

Define recovery procedures for transactions in all the states shown in Figure 21.2 when a crash occurs. Consider the possibility of a crash during the recovery procedures.

The recovery algorithm given in Section 21.5 can be followed. The undo and redo operations must be made idempotent to allow for a crash during recovery. With the information recorded as stated above, an undo requires only that we set an object’s state to its value before the invocation. The record also holds the operation and its arguments, so we have sufficient information to redo an operation.
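The following is a minimal sketch of such a procedure, not the algorithm of Section 21.5 itself. It assumes a strict execution schedule, an account state modelled as a `(balance, rate_percent)` pair of integers, and a set of committed transaction ids obtained from the log; the names `LogRecord`, `apply_op` and `recover` are hypothetical. Redo recomputes the result from the recorded prior state, and undo reinstates the prior state, so both are idempotent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    tid: str      # transaction id
    oid: str      # object id
    op: str       # operation name
    args: tuple   # operation arguments
    prior: tuple  # (balance, rate_percent) before the invocation


def apply_op(op, state, args):
    """Compute the state that results from applying op to state."""
    balance, rate = state
    if op == "credit":
        return (balance + args[0], rate)
    if op == "debit":
        return (balance - args[0], rate)
    if op == "add_interest_to_balance":
        return (balance + balance * rate // 100, rate)
    if op == "set_interest_rate":
        return (balance, args[0])
    raise ValueError(f"unknown operation: {op}")


def recover(store, log, committed):
    # Redo committed work in log order, starting from the recorded
    # prior state rather than from whatever is currently in the store.
    for rec in log:
        if rec.tid in committed:
            store[rec.oid] = apply_op(rec.op, rec.prior, rec.args)
    # Undo uncommitted work in reverse order, so that the earliest
    # prior state of each object is the one finally reinstated.
    for rec in reversed(log):
        if rec.tid not in committed:
            store[rec.oid] = rec.prior


log = [
    LogRecord("T1", "A", "credit", (30,), (100, 5)),            # committed
    LogRecord("T2", "B", "set_interest_rate", (7,), (50, 5)),   # active
    LogRecord("T2", "B", "debit", (20,), (50, 7)),              # active
]
# Persistent store at the crash: T1's update to A was lost,
# T2's updates to B had already been written.
store = {"A": (100, 5), "B": (30, 7)}

recover(store, log, committed={"T1"})
first_pass = dict(store)
recover(store, log, committed={"T1"})  # simulate a crash during recovery
```

Running `recover` a second time leaves the store unchanged, which is the property needed if a crash interrupts recovery and it has to be restarted.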

21-5 Redesign the recovery algorithm of Section 21.8 for a non-strict execution schedule (where cascading aborts must be allowed for).