Course material 2010–11
Concurrent and Distributed Systems II
Lecturer: Dr D. Evans
No. of lectures: 8
Prerequisite courses: Concurrent and Distributed Systems I, Operating Systems
These eight lectures, together with the eight in the Michaelmas Term, form a single course of 16 lectures.
Aims
The aims of this course are to study the fundamental characteristics of distributed systems, including their models and architectures; the implications for software design; some of the techniques that have been used to build them; and the resulting details of good distributed algorithms and applications.
Lectures on Distributed Systems (Lent Term)
- Introduction, Evolution, Architecture.
Fundamental properties. Evolution from LANs. Introduction to the need
for naming, authentication, policy specification and enforcement.
Examples of multi-domain systems. Why things can get difficult
quickly. Enough Erlang to understand subsequent examples.
- Time and event ordering.
Time, clocks and event ordering.
Earth time, computer clocks, clock drift, clock synchronisation.
Order imposed by inter-process communication.
Timestamps point/interval.
Event composition; uncertainty of ordering, failure and delay.
Process groups: open/closed, structured/unstructured. Message delivery ordering: arrival order; causal order (vector clocks); total order. Physical causality from real-world examples.
- Consistency and commitment.
Strong and weak consistency. Replica management. Quorum assembly.
Distributed transactions. Distributed concurrency control:
two-phase locking, timestamp ordering.
Atomic commitment; two-phase commit protocol.
Distributed optimistic concurrency control and commitment.
Some algorithm outlines: Election of a leader. Distributed mutual exclusion.
- Middleware.
Synchronous: RPC, object-orientated. Asynchronous: message orientated,
publish/subscribe, peer-to-peer.
Event-based systems.
Examples of some simple distributed programs in Java and Erlang.
- Naming and name services.
Unique identifiers, pure and impure names.
Name spaces, naming domains, name resolution.
Large scale name services: DNS, X.500/LDAP, GNS.
Use of replication. Consistency-availability tradeoffs.
Design assumptions and future issues.
- Access control for multi-domain distributed systems.
Requirements from healthcare, police, emergency services, globally
distributed companies.
ACLs, capabilities, Role-Based Access Control (RBAC). Context aware
access control.
Examples: OASIS, CBCL OASIS, Microsoft Healthvault, ... Authentication and authorisation: Raven, Shibboleth, OpenID.
- Distributed storage services. Summary and roundup.
Network-based storage services. Naming and access control.
Peer-to-peer protocols. Content distribution.
Summary and roundup. Open problems for future years: transactional main
memory; multicore concurrency control; untrusted components. Byzantine
failure.
Objectives
At the end of the course students should
- understand the need for concurrency control in operating systems
and applications, both mutual exclusion and condition
synchronisation;
- understand how multi-threading can be supported and the
implications of different approaches;
- be familiar with the support offered by various programming
languages for concurrency control and be able to judge the scope,
performance implications and possible applications of the various
approaches;
- be aware that dynamic resource allocation can lead to deadlock;
- understand the concept of transaction; the properties of
transactions, how concurrency control can be assured and how
transactions can be distributed;
- understand the fundamental properties of distributed systems and
their implications for system design;
- understand the effects of large scale on the provision of
fundamental services and the tradeoffs arising from scale;
- be familiar with a range of distributed algorithms.
Recommended reading
* Bacon, J. & Harris, T. (2003). Operating systems: distributed and concurrent software design. Addison-Wesley.
Bacon, J. (1997). Concurrent Systems. Addison-Wesley.
Tanenbaum, A.S. & van Steen, M. (2002). Distributed systems. Prentice Hall.
Coulouris, G.F., Dollimore, J.B. & Kindberg, T. (2005, 2001). Distributed systems, concepts and design. Addison-Wesley (4th, 3rd eds.).