Computer Laboratory – Course material 2010–11: Concurrent and Distributed Systems II

Course material 2010–11

Concurrent and Distributed Systems II

Lecturer: Dr D. Evans

No. of lectures: 8

Prerequisite courses: Concurrent and Distributed Systems I, Operating Systems

These eight lectures, together with the eight in the Michaelmas Term, form a single course of 16 lectures.

Aims

The aims of this course are to study the fundamental characteristics of distributed systems, including their models and architectures; the implications for software design; some of the techniques that have been used to build them; and the resulting details of good distributed algorithms and applications.

Lectures on Distributed Systems (Lent Term)

Introduction, Evolution, Architecture. Fundamental properties. Evolution from LANs. Introduction to the need for naming, authentication, policy specification and enforcement. Examples of multi-domain systems. Why things can get difficult quickly. Enough Erlang to understand subsequent examples.
Time and event ordering. Time, clocks and event ordering. Earth time, computer clocks, clock drift, clock synchronisation. Order imposed by inter-process communication. Timestamps point/interval. Event composition; uncertainty of ordering, failure and delay.

Process groups: open/closed, structured/unstructured. Message delivery ordering: arrival order; causal order (vector clocks); total order. Physical causality from real-world examples.
Consistency and commitment. Strong and weak consistency. Replica management. Quorum assembly. Distributed transactions. Distributed concurrency control: two-phase locking, timestamp ordering. Atomic commitment; two-phase commit protocol. Distributed optimistic concurrency control and commitment.

Some algorithm outlines: Election of a leader. Distributed mutual exclusion.
Middleware. Synchronous: RPC, object-orientated. Asynchronous: message orientated, publish/subscribe, peer-to-peer. Event-based systems. Examples of some simple distributed programs in Java and Erlang.
Naming and name services. Unique identifiers, pure and impure names. Name spaces, naming domains, name resolution. Large scale name services: DNS, X.500/LDAP, GNS. Use of replication. Consistency-availability tradeoffs. Design assumptions and future issues.
Access control for multi-domain distributed systems. Requirements from healthcare, police, emergency services, globally distributed companies. ACLs, capabilities, Role-Based Access Control (RBAC). Context aware access control. Examples: OASIS, CBCL OASIS, Microsoft Healthvault, ... Authentication and authorisation: Raven, Shibboleth, OpenID.
Distributed storage services. Summary and roundup. Network-based storage services. Naming and access control. Peer-to-peer protocols. Content distribution. Summary and roundup. Open problems for future years: transactional main memory; multicore concurrency control; untrusted components. Byzantine failure.

Objectives

At the end of the course students should

understand the need for concurrency control in operating systems and applications, both mutual exclusion and condition synchronisation;
understand how multi-threading can be supported and the implications of different approaches;
be familiar with the support offered by various programming languages for concurrency control and be able to judge the scope, performance implications and possible applications of the various approaches;
be aware that dynamic resource allocation can lead to deadlock;
understand the concept of transaction; the properties of transactions, how concurrency control can be assured and how transactions can be distributed;
understand the fundamental properties of distributed systems and their implications for system design;
understand the effects of large scale on the provision of fundamental services and the tradeoffs arising from scale;
be familiar with a range of distributed algorithms.

Recommended reading

* Bacon, J. & Harris, T. (2003). Operating systems: distributed and concurrent software design. Addison-Wesley.
Bacon, J. (1997). Concurrent Systems. Addison-Wesley.
Tanenbaum, A.S. & van Steen, M. (2002). Distributed systems. Prentice Hall.
Coulouris, G.F., Dollimore, J.B. & Kindberg, T. (2005, 2001). Distributed systems, concepts and design. Addison-Wesley (4th, 3rd eds.).