Computer Laboratory

Course material 2010–11


Concurrent and Distributed Systems II

Lecturer: Dr D. Evans

No. of lectures: 8

Prerequisite courses: Concurrent and Distributed Systems I, Operating Systems

These eight lectures, together with the eight in the Michaelmas Term, form a single course of 16 lectures.

Aims

The aims of this course are to study the fundamental characteristics of distributed systems, including their models and architectures; the implications for software design; some of the techniques that have been used to build them; and the resulting details of good distributed algorithms and applications.

Lectures on Distributed Systems (Lent Term)

  • Introduction, Evolution, Architecture. Fundamental properties. Evolution from LANs. Introduction to the need for naming, authentication, policy specification and enforcement. Examples of multi-domain systems. Why things can get difficult quickly. Enough Erlang to understand subsequent examples.

  • Time and event ordering. Time, clocks and event ordering. Earth time, computer clocks, clock drift, clock synchronisation. Order imposed by inter-process communication. Timestamps: point/interval. Event composition; uncertainty of ordering, failure and delay.

    Process groups: open/closed, structured/unstructured. Message delivery ordering: arrival order; causal order (vector clocks); total order. Physical causality from real-world examples.

  • Consistency and commitment. Strong and weak consistency. Replica management. Quorum assembly. Distributed transactions. Distributed concurrency control: two-phase locking, timestamp ordering. Atomic commitment; two-phase commit protocol. Distributed optimistic concurrency control and commitment.

    Some algorithm outlines: Election of a leader. Distributed mutual exclusion.

  • Middleware. Synchronous: RPC, object-orientated. Asynchronous: message-orientated, publish/subscribe, peer-to-peer. Event-based systems. Examples of some simple distributed programs in Java and Erlang.

  • Naming and name services. Unique identifiers, pure and impure names. Name spaces, naming domains, name resolution. Large-scale name services: DNS, X.500/LDAP, GNS. Use of replication. Consistency-availability tradeoffs. Design assumptions and future issues.

  • Access control for multi-domain distributed systems. Requirements from healthcare, police, emergency services, globally distributed companies. ACLs, capabilities, Role-Based Access Control (RBAC). Context-aware access control. Examples: OASIS, CBCL OASIS, Microsoft HealthVault, ... Authentication and authorisation: Raven, Shibboleth, OpenID.

  • Distributed storage services. Network-based storage services. Naming and access control. Peer-to-peer protocols. Content distribution. Summary and roundup. Open problems for future years: transactional main memory; multicore concurrency control; untrusted components. Byzantine failure.

Objectives

At the end of the course, students should

  • understand the need for concurrency control in operating systems and applications, both mutual exclusion and condition synchronisation;

  • understand how multi-threading can be supported and the implications of different approaches;

  • be familiar with the support offered by various programming languages for concurrency control and be able to judge the scope, performance implications and possible applications of the various approaches;

  • be aware that dynamic resource allocation can lead to deadlock;

  • understand the concept of a transaction, the properties of transactions, how concurrency control can be assured and how transactions can be distributed;

  • understand the fundamental properties of distributed systems and their implications for system design;

  • understand the effects of large scale on the provision of fundamental services and the tradeoffs arising from scale;

  • be familiar with a range of distributed algorithms.

Recommended reading

* Bacon, J. & Harris, T. (2003). Operating systems: distributed and concurrent software design. Addison-Wesley.
Bacon, J. (1997). Concurrent systems. Addison-Wesley.
Tanenbaum, A.S. & van Steen, M. (2002). Distributed systems. Prentice Hall.
Coulouris, G.F., Dollimore, J.B. & Kindberg, T. (2005, 2001). Distributed systems: concepts and design. Addison-Wesley (4th, 3rd eds.).