Course pages 2012–13
- Aims of the Michaelmas Term part of the course
- Michaelmas Term Lectures (Concurrency)
- Objectives
- Recommended reading
- Aims of the Lent Term part of the course
- Lent Term Lectures (Distributed Systems)
- Objectives
- Recommended reading
Concurrent and Distributed Systems
Lecturer: Dr R.M. Watson and Dr A. Madhavapeddy
No. of lectures: 16 (Continued in Lent Term)
Suggested hours of supervisions: 4
Prerequisite courses: Operating Systems, Programming in Java
This course is a pre-requisite for Mobile and Sensor Systems (Part II).
Aims of the Michaelmas Term part of the course
The aim of the course is to introduce concurrency control and distribution concepts and their implications for system design and implementation.
Michaelmas Term Lectures (Concurrency)
- Introduction; thread models.
Overview of properties of distributed and concurrent systems. Software
system structure. Occurrence of concurrency in systems. Recap of scheduling
and pre-emption. Thread models.
- Classical concurrency control.
Shared data and critical regions. Mutual exclusion and condition
synchronisation. Semaphores. Implementation of concurrency control.
- Classical problems using semaphores.
Bounded cyclic buffer (producer(s) and consumer(s)), multiple readers and
writers. Problems arising in semaphore programming.
- Concurrency support in programming languages.
Shared data: monitors, pthreads, Java. No shared data: occam, Ada active
objects, Erlang, Kilim, tuple spaces.
Lock-free programming.
- Concurrent composite operations.
Composite operations in main memory and persistent memory. Dynamic
resources allocation and deadlock. Dining philosophers program. Deadlock
detection and avoidance.
- Transactions.
ACID properties. Concurrency control and crash recovery. Definition of
conflicting operations. Serialisation. Cascading aborts.
- Database concurrency control.
Pessimistic concurrency control: two-phase locking, timestamp ordering.
Optimistic concurrency control.
- Database recovery and summary of “Concurrency”.
Write ahead log, undo/redo. Points to take forward.
Objectives
At the end of the course students should
- understand the need for concurrency control in operating systems and
applications, both mutual exclusion and condition synchronisation;
- understand how multi-threading can be supported and the implications of
different approaches;
- be familiar with the support offered by various programming languages for
concurrency control and be able to judge the scope, performance
implications and possible applications of the various approaches;
- be aware that dynamic resource allocation can lead to deadlock
- understand the concept of transaction; the properties of transactions,
how concurrency control can be assured and how transactions can be
distributed;
- understand the fundamental properties of distributed systems and their
implications for system design;
- understand the effects of large scale on the provision of fundamental
services and the tradeoffs arising from scale;
- be familiar with a range of distributed algorithms.
Recommended reading
* Bacon, J. & Harris, T. (2003). Operating systems: distributed and concurrent software design. Addison-Wesley.
Bacon, J. (1997). Concurrent Systems. Addison-Wesley.
Tanenbaum, A.S. & van Steen, M. (2002). Distributed systems. Prentice Hall.
Coulouris, G.F., Dollimore, J.B. & Kindberg, T. (2005, 2001). Distributed systems, concepts and design. Addison-Wesley (4th, 3rd eds.).
Aims of the Lent Term part of the course
The aims of this course are to study the fundamental characteristics of distributed systems, including their models and architectures; the implications for software design; some of the techniques that have been used to build them; and the resulting details of good distributed algorithms and applications.
Lent Term Lectures (Distributed Systems)
- Introduction, Evolution, Architecture.
Fundamental properties. Evolution from LANs. Introduction to the need
for naming, authentication, policy specification and enforcement.
Examples of multi-domain systems. Why things can get difficult
quickly. Enough Erlang to understand subsequent examples.
- Time and event ordering.
Time, clocks and event ordering.
Earth time, computer clocks, clock drift, clock synchronisation.
Order imposed by inter-process communication.
Timestamps point/interval.
Event composition; uncertainty of ordering, failure and delay.
Process groups: open/closed, structured/unstructured. Message delivery ordering: arrival order; causal order (vector clocks); total order. Physical causality from real-world examples.
- Consistency and commitment.
Strong and weak consistency. Replica management. Quorum assembly.
Distributed transactions. Distributed concurrency control:
two-phase locking, timestamp ordering.
Atomic commitment; two-phase commit protocol.
Distributed optimistic concurrency control and commitment.
Some algorithm outlines: Election of a leader. Distributed mutual exclusion.
- Middleware.
Synchronous: RPC, object-orientated. Asynchronous: message orientated,
publish/subscribe, peer-to-peer.
Event-based systems.
Examples of some simple distributed programs in Java and Erlang.
- Naming and name services.
Unique identifiers, pure and impure names.
Name spaces, naming domains, name resolution.
Large scale name services: DNS, X.500/LDAP, GNS.
Use of replication. Consistency-availability tradeoffs.
Design assumptions and future issues.
- Access control for multi-domain distributed systems.
Requirements from healthcare, police, emergency services, globally
distributed companies.
ACLs, capabilities, Role-Based Access Control (RBAC). Context aware
access control.
Examples: OASIS, CBCL OASIS, Microsoft Healthvault, ... Authentication and authorisation: Raven, Shibboleth, OpenID.
- Distributed storage services. Summary and roundup.
Network-based storage services. Naming and access control.
Peer-to-peer protocols. Content distribution.
Summary and roundup. Open problems for future years: transactional main
memory; multicore concurrency control; untrusted components. Byzantine
failure.
Objectives
At the end of the course students should
- understand the need for concurrency control in operating systems
and applications, both mutual exclusion and condition
synchronisation;
- understand how multi-threading can be supported and the
implications of different approaches;
- be familiar with the support offered by various programming
languages for concurrency control and be able to judge the scope,
performance implications and possible applications of the various
approaches;
- be aware that dynamic resource allocation can lead to deadlock;
- understand the concept of transaction; the properties of
transactions, how concurrency control can be assured and how
transactions can be distributed;
- understand the fundamental properties of distributed systems and
their implications for system design;
- understand the effects of large scale on the provision of
fundamental services and the tradeoffs arising from scale;
- be familiar with a range of distributed algorithms.
Recommended reading
* Bacon, J. & Harris, T. (2003). Operating systems: distributed and concurrent software design. Addison-Wesley.
Bacon, J. (1997). Concurrent Systems. Addison-Wesley.
Tanenbaum, A.S. & van Steen, M. (2002). Distributed systems. Prentice Hall.
Coulouris, G.F., Dollimore, J.B. & Kindberg, T. (2005, 2001). Distributed systems, concepts and design. Addison-Wesley (4th, 3rd eds.).