Distributed Operating Systems

Distributed Operating Systems extend the notion of a virtual machine over a number of interconnected computers, or hosts. The user or programmer still has the illusion of working on a single system: all the issues of concurrency and distribution are hidden by the virtual machine, so the user or programmer is not at liberty to exploit them (nor should they be hindered by them!). Distributed Operating Systems are often broadly classified into two extremes of a spectrum:

Loosely Coupled Systems. Components are workstations, LANs and servers, e.g. V-System, BSD Unix.
Tightly Coupled Systems. Components are processors, memory, bus and I/O, e.g. Meiko Compute Surface.

Often this classification is really a reflection of the reliability and performance of the communications sub-system. Frequently, shared memory systems are regarded as more tightly coupled than message passing systems. Another way of looking at these classifications is to think of tightly coupled systems as being "dependent", and loosely coupled systems as "independent", where the dependency is in terms of system availability in the face of failure of some single host. In tightly coupled systems, it is reasonable to consider shared memory (or at least hierarchical cache mechanisms) as a communications mechanism. In loosely coupled systems, only message passing can be considered.

Distributed systems have been around since the early 1970s. Examples of early distributed systems are tabulated under "Early Distributed Systems" below.
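The contrast between shared memory and message passing as communication mechanisms can be made concrete with a small sketch. This is only an illustration on a single machine, assuming Python threads stand in for processors: the function names `shared_memory_sum` and `message_passing_sum` are invented for this example, and in a real loosely coupled system the queue would be replaced by a network.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Tightly coupled style: every worker updates one shared variable."""
    total = {"value": 0}          # memory visible to all workers
    lock = threading.Lock()       # needed because the memory is shared

    def work(chunk):
        partial = sum(chunk)
        with lock:
            total["value"] += partial

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def message_passing_sum(values, workers=4):
    """Loosely coupled style: workers share nothing and send results as messages."""
    mailbox = queue.Queue()       # stands in for the communications sub-system

    def work(chunk):
        mailbox.put(sum(chunk))   # the only interaction is sending a message

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The collector receives exactly one message per worker.
    return sum(mailbox.get() for _ in range(workers))
```

Both functions compute the same result. The difference lies in what a worker must be able to reach: the first needs access to the collector's memory, while the second needs only a channel to it, which is why only the second style carries over to hosts that can fail independently.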

Table: Early Distributed Systems

Table: Recent Distributed Systems

These systems are discussed informally in the final chapter of this book. They can be categorised by whether they are:

Network Operating Systems, e.g. 4.xBSD Unix. These are conventional centralized operating systems with networking facilities added as operating system services, but distinct from other (I/O) services.
Distributed Operating Systems, e.g. MACH. These systems allow a distributed set of processors to appear as a single system.
Distributed Human Access, e.g. OS + X Windows. A set of systems running centralized operating system services are made to appear as a single system to the human user.
Distributed File Systems, e.g. Unix + NFS. Rather than providing a global human view of the systems, we provide the systems with a global view of storage, and therefore give any programs the same view.
Distributed Processing Environments, e.g. V-System/Amoeba. The axis of distribution is the processor rather than terminal/window I/O or the storage system.

In practice, the most widespread systems are those combining distributed file access and distributed human access. The workstation/fileserver/compute server model has evolved in the last ten years predominantly because two costs have fallen: that of LAN access with enough performance to provide realistic remote disk/file access, and that of memory and bitmap displays, which is now low enough to make window-based software realistic on the desktop.

Slowly, some more useful distributed tools are emerging. Since distributed systems have existed mainly in a research and development environment, there has been some work on tools to help with software development in a distributed environment. These include:

Automated software distribution (BSD rdist).
Shared views of data (single editor, multiple reviewers!), conferencing.
Generation of multiple executables for different architectures from a single program source development tree (including multiple source code revision control trees).
Distributed Make facilities (ability to compile independent source files separately and automatically on multiple workstations).
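The idea behind a distributed make facility can be sketched briefly. The following is a toy model rather than any real tool: the host names, the round-robin placement and the `fake_compile` stand-in are all assumptions made for illustration, and a thread pool plays the part of the workstations.

```python
from concurrent.futures import ThreadPoolExecutor


def assign_jobs(sources, hosts):
    """Spread independent source files over the hosts, round-robin."""
    plan = {host: [] for host in hosts}
    for index, source in enumerate(sources):
        plan[hosts[index % len(hosts)]].append(source)
    return plan


def fake_compile(host, source):
    """Stand-in for a remote compile: names the object file it would produce."""
    return f"{host}:{source.rsplit('.', 1)[0]}.o"


def build_on_host(host, files):
    """Each host compiles its own share of the files, one after another."""
    return [fake_compile(host, f) for f in files]


def distributed_make(sources, hosts):
    """Compile every source file, with the hosts working concurrently."""
    plan = assign_jobs(sources, hosts)
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        results = pool.map(lambda item: build_on_host(*item), plan.items())
        return sorted(obj for objs in results for obj in objs)
```

The sketch works only because the source files are independent of one another. A real facility would also have to respect the dependency graph, ship sources and objects between hosts, and cope with a workstation that fails part way through.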

Enslow's classic classification [REF] uses three axes of distribution:

Processors. Any interesting distributed system has processing capability in more than one place.
Control. The programs that make up the system have components in more than one processor. Another way of saying this is that the thread of control crosses more than one address space.
Data. The data required for a given task is located at more than one place, perhaps replicated for reliability reasons, or partitioned for performance reasons.
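One informal way to work with these three axes is to record, for each system under study, which of them it distributes. The sketch below assumes a simple yes/no reading of each axis. The class name `Distribution` and the example profile are hypothetical and are not a classification of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Distribution:
    """Which of the three axes a system distributes (a yes/no simplification)."""
    processors: bool
    control: bool
    data: bool

    def axes(self):
        """Names of the axes along which the system is distributed."""
        return [name for name in ("processors", "control", "data")
                if getattr(self, name)]

    def fully_distributed(self):
        """True only when all three axes are distributed."""
        return self.processors and self.control and self.data


# Hypothetical profile: processing in several places, data spread out,
# but control kept within a single address space.
example_service = Distribution(processors=True, control=False, data=True)
```

Calling `example_service.axes()` lists the axes in use, and `fully_distributed()` separates systems that distribute everything from those that distribute only some services.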

If we apply these models to the systems above, we can see the choices made in the distribution of services. Distribution of services is rather different from distribution of the Operating System itself. In some systems [e.g. Sun NFS/Newcastle Connection] an operating system service [File access] has been distributed from within the operating system. In others [e.g. printing], the service has been distributed above (outside) the operating system.

Figure: Distributed Operating System Services

To provide this distribution, a number of communication mechanisms are required; these are outlined in the next section.