The programming model of Nemesis is a framework for describing how programs are structured; in a sense, it is how a programmer thinks about an application. In particular, it is concerned with how components of a program or subsystem interact with one another.
The goal of the programming model is to reduce complexity for the programmer. This is particularly important in Nemesis where applications tend to be more complex as a consequence of the architecture. The model is independent of programming language or machine representation, though its form has been strongly influenced by the model of linkage to be presented in section 2.3.
In systems, complexity is typically managed by the use of modularity: decomposing a complex system into a set of components which interact across well-defined interfaces. In software systems, the interfaces are often instances of abstract data types (ADTs), consisting of a set of operations which manipulate some hidden state. This approach is used in Nemesis.
Although the model is independent of representation, it is often convenient to describe it in terms of the two main languages used in the implementation of Nemesis: the interface definition language MIDDL and a stylised version of the programming language C.
Nemesis, like Spring, is unusual among operating systems in that all interfaces are strongly typed, and these types are defined in an interface definition language. It is clearly important, therefore, to start with a good type system, and [Evers 93] presents a good discussion of the issues of typing in a systems environment. As in many RPC systems, the type system used in Nemesis is a hybrid: it includes notions both of the abstract types of interfaces and of concrete data types. It represents a compromise between the conceptual elegance and software engineering benefits of purely abstract type systems such as that used in Emerald [Raj 91], and the requirements of efficiency and inter-operability: the goal is to implement an operating system with few restrictions on programming language.
Concrete types are data types whose structure is explicit. They can be predefined (such as booleans, strings, and integers of various sizes) or constructed (as with records, arrays, etc). The space of concrete types also includes typed references to interfaces.
Interfaces are instances of ADTs. Interfaces are rarely static: they can be dynamically created and references to them passed around freely. The type system includes a simple concept of subtyping. An interface type can be a subtype of another ADT, in which case it supports all the operations of the supertype, and an instance of the subtype can be used where an instance of the supertype is required.
The operations supported by interfaces are like procedure calls: they take a number of arguments and normally return a number of results. They can also raise exceptions, which themselves can take arguments. Exceptions in Nemesis behave in a similar way to those in Modula-3 [Nelson 91].
Interface types are defined in an IDL called MIDDL [Roscoe 94b]. MIDDL is similar in functionality to the IDL s used in object-based RPC systems, with some additional constructs to handle local and low-level operating system interfaces. A MIDDL specification defines a single ADT by declaring its supertype, if any, and giving the signatures of all the operations it supports. A specification can also include declarations of exceptions, and concrete types. Figure 2.1 shows a typical interface specification, the (slightly simplified) definition of the Context interface type.
Figure 2.1: MIDDL specification of the Context interface type
The word object in Nemesis denotes what lies behind an interface: an object consists of state and code to implement the operations of the one or more interfaces it provides. A class is a set of objects which share the same underlying implementation, and the idea of object class is distinct from that of type, which is a property of interfaces rather than objects.
This definition of an object as hidden state and typed interfaces may be contrasted with the use of the term in some object-oriented programming languages like C++ [Stroustrup 91]. In C++ there is no distinction between class and type, and hence no clear notion of an interface. The type of an interface is always purely abstract: it says nothing about the implementation of any object which exports it. It is normal to have a number of different implementations of the same type.
When an operation is invoked upon an object across one of its interfaces, the environment in which the operation is performed depends only on the internal state of the object and the arguments of the invocation. There are no global symbols in the programming model. Apart from the benefits of encapsulation this provides, it facilitates the sharing of code described in section 2.3.
An object is created by an invocation on an interface, which returns a set of references to the interfaces exported by the new object. As in Emerald, constructors are the basic instantiation mechanism rather than classes. By removing the artificial distinction between objects and the means used to create them, creation of interfaces in the operating system can be more flexible than the `institutionalised' mechanisms of language runtime systems. This is particularly important in the lower levels of an operating system, where a language runtime is not available.
The programming model described so far enforces strict encapsulation of objects: the environment in which an interface operation executes is determined entirely by the operation arguments and the object state. Unfortunately, there are cases where this is too restrictive from a practical point of view. Certain interfaces provided by the operating and runtime systems are used so pervasively by application code that it is more natural to treat them as part of the thread context than the state of some object. These include:
However, passing references to all these interfaces as parameters to every operation is ugly and complicates code. The references could be stored as part of the object state, but this still requires that they be passed as arguments to object constructors, and complicates the implementation of objects which would otherwise have no mutable state (and could therefore be shared among domains as is).
Pervasive interfaces are therefore viewed as part of the context of the currently executing thread. As such they are always available, and are carried across an interface when an invocation is made. This view has a number of advantages:
The programming model has to address the problem of memory allocation. An invocation across an interface can cause the creation of a concrete structure which occupies an area of memory. There needs to be a convention for determining:
Unfortunately, Nemesis does not provide a central garbage collector and a domain typically has a variety of pools to allocate memory from, each corresponding to an interface of type Heap (multiple heaps are used to allocate shared memory from areas with different access permissions). Moreover, it is desirable to preserve a degree of communication transparency: wherever possible, a programmer should not need to know whether a particular interface is exported by an object local to the domain or is a surrogate for a remote one.
Network-based RPC systems without garbage collection use conventions to decide when the RPC runtime has allocated memory for unmarshalling large or variable-sized parameters. Usually this memory is allocated by the language heap, although some RPC systems have allowed callers to specify different heaps at bind time (for example, [Roscoe 94c]). To preserve transparency, in all cases the receiver of the data is responsible for freeing it. This ensures that the application code need not be aware of whether a local object or the RPC run time system has allocated memory.
In systems where caller and object are in different protection domains but share areas of memory, the situation is complicated because of the desire to avoid unnecessary memory allocation and data copies. Ideally, the conventions used should accommodate both the cases where the caller allocates space for the results in advance, and where the callee allocates space on demand from caller memory during the invocation.
Nemesis uses parameter passing modes to indicate memory allocation policy: each parameter in a MIDDL operation signature has an associated mode, which is one of the following:
The OUT mode allows results to be written by a local object into space already allocated by the client (in the stack frame, for example). In the remote case, it is more efficient than the IN OUT mode because the value does not need to be transmitted to the server; it is only returned.
These modes are all implemented on the Alpha processor using call by reference, except RESULT, which returns a pointer to the new storage. For values small enough to fit into a machine word, IN is coded as call by value and RESULT returns the value itself rather than a reference to it.
These conventions have been found to cover almost all cases encountered in practice. As a last resort, MIDDL possesses a REF type constructor which allows pointers to values of a particular type to be passed explicitly.
The linkage model concerns the data structures used to link program components, and their interpretation at runtime. An early version of the linkage mechanism was described in [Roscoe 94a]. Its goal is twofold:
A stub compiler is used to map MIDDL type definitions to C language types. The compiler, known as middlc processes an interface specification and generates a header file giving C type declarations for the concrete types defined in the interface together with special types used to represent instances of the interface.
An interface is represented in memory as a closure: a record of two pointers, one to an array of function pointers and one to a state record (figure 2.2).
Figure 2.2: Context interface closure
To invoke an operation on an interface, the client calls through the appropriate element of the operation table, passing as first argument the address of the closure itself. The middlc compiler generates appropriate C data types so that an operation can be coded as, for example: b = ctxt->op->Get( ctxt, "modules>DomainMgr", &dma );
In this case, ctxt is the interface reference. middlc\ generates C preprocessor macros so one may use the CLU-like syntax: b = Context$Get( ctxt, "modules>DomainMgr", &dma );
A Nemesis module is a unit of loadable code, analogous to an object file in an UNIX system. All the code in Nemesis exists in one module or another. These modules are quite small, typically about 10 kilobytes of text and about the same of constant data. The use of constructor interfaces for objects rather than explicit class structures makes it natural to write a module for each kind of object, containing code to implement both the object's interfaces and its constructors. Such modules are similar to CLU clusters [Liskov 81], though `own' variables are not permitted.
Modules are created by running the UNIX ld linker on object files. The result is a file which has no unresolved references, a few externally visible references, and no uninitialised or writable data.
All linkage between modules is performed via pointers to closures. A module will export one or more fixed closures (for example, the constructors) as externally visible symbols, and the system loader installs these in a name space (see section 2.4) when the module is loaded. To use the code in a module, an application must locate an interface for the module, often by name lookup. In this sense linking modules is entirely dynamic.
If a domain wishes to create an object with mutable state, it must invoke an operation on an existing interface which returns an interface reference of the required type and class.
Figure 2.3: Module and instantiated object
Figure 2.3 shows an example where a domain has instantiated a naming context by calling the New operation of an interface of type ContextMod. The latter is implemented by a module with no mutable state, and has instantiated an object with two interfaces, of types Context and Debugging. The module has returned pointers to these in the results c and d. The state of the object includes a heap interface reference, passed as a parameter to the constructor and closed over.
The use of interfaces and modules in Nemesis permits a model where all text and data occupies a single address space, since there is no need for data or text to be at well-known addresses in each domain. The increasing use of 64-bit processors with very large virtual address spaces (the Alpha processor on which Nemesis runs implements 43 bits of a 64-bit architectural addressing range) makes the issue of allocating single addresses to each object in the system relatively easy.
It must be emphasised that this in no way implies a lack of memory protection between domains. The virtual address translations in Nemesis are the same for all domains, while the protection rights on a given page may vary. Virtual address space in Nemesis is divided into segments (sometimes called stretches) which have access control lists associated with them. What it does mean is that any area of memory in Nemesis can be shared, and addresses of memory locations do not change between domains.
While simple addresses in the single address space suffice to identify any interface (or other data value) in the system, a more structured system of naming is also required.
The name space in Nemesis is completely independent of the rest of the operating system. While some operating system components do implement part of the name space, most naming contexts are first-class objects: they can be created at will and are capable of naming any value which has a MIDDL type.
There are few restrictions on how the name space is structured. The model followed is that of [Saltzer 79]: a name is a textual string, a binding is an association of a name with some value, and a context is a collection of bindings. Resolving a name is the process of locating the value bound to it. Name resolution requires that a context be specified.
Naming contexts are represented by interfaces which conform to the type Context. Operations are provided to bind a name to any value, resolve the name in the context and delete a binding from the context. The values bound in a context can be of arbitrary type, in particular they can be references to other interfaces of type Context. Naming graphs can be constructed in this way, and a pathname may be presented to a context in place of a simple name. A pathname consists of a sequence of names separated by `>` distinguished characters. To resolve such a pathname, the context object examines the first component of the name. If this name resolves to a context, this second context is invoked to resolve the remainder of the pathname.
The MergedContext interface type is a subtype of Context, modelled after a similar facility in Spring [Radia 93]. An instance of MergedContext represents a composition of naming contexts; when the merge is searched, each component context is queried in turn to try and resolve the first element of the name. Operations are provided to add and remove contexts from the merge.
Figure 2.4: Example name space
Figure 2.4 illustrates part of a naming graph created by the Nemesis system at boot time. Context A is the first to be created. Since one must always specify a context in which to resolve a name, there is no distinguished root. However A serves as a root for the kernel by convention. Context B holds local interfaces created by the system, thus `Services>DomainMgr' is a name for the Domain Manager service, relative to context A. Any closures exported by loaded modules are stored in context C (`Modules'), and are used during domain bootstrapping.
Context D has two names relative to the root, `Services>TypeSystem' and `Modules>TypeSystem'. This context is not in fact implemented in the usual way, but is part of the runtime type system, described in the next section.
The Type System is a system service which adds a primitive form of dynamic typing, similar to [Rovner 85]. Each MIDDL\ type is assigned a unique Type.Code, and the Type.Any type provides support for data values whose type is not known at compile time. The TypeSystem interface provides the operations IsType, to determine whether a Type.Any conforms to a particular type, and Narrow, which converts a Type.Any to a specified type if the type equivalence rules permit. A major use of Type.Any is in the naming interfaces: values of this type are bound to names in the name space.
The Type System data structures are accessible at run time through a series of interfaces whose types are subtypes of Context. For example, an operation within an interface is represented by an interface of type Operation, whose naming context includes all the parameters of the operation. Every MIDDL interface type is completely represented in this way.
A good example of how the programming and linkage models work well in practice is CLANGER [Roscoe 95a] , a novel interpreted language for operating systems. CLANGER relies on the following three system features:
In CLANGER a variable name is simply a pathname relative to a naming context specified when the interpreter was instantiated. All values in the language are represented as Type.Anys. The language allows operations to be invoked on variables which are interface references by synthesising C call frames. Access to the Type System allows the interpreter to type-check and narrow the arguments to the invocation, and select appropriate types for the return values.
The invocation feature means that the language can be fully general without a complex interpreter or the need to write interface `wrappers' in a compiled language. This capability was previously only available in development systems such as Oberon [Gutknecht ] and not in a general-purpose, protected operating system. CLANGER can be used for prototyping, debugging, embedded control, operating system configuration and as a general purpose programmable command shell.
The business of starting up a new domain in Nemesis is of interest, partly because the system is very different from UNIX and partly because it gives an example of the use of a single address space to simplify some programming problems.
The traditional UNIX fork primitive is not available in Nemesis. The state of a running domain consists of a large number of objects scattered around memory, many of which are specific to the domain. Duplicating this information for a child domain is not possible, and would create much confusion even if it were, particularly for domains with communication channels to the parent. In any case, fork is rarely used for producing an exact duplicate of the parent process, rather it is a convenient way of bootstrapping a process largely by copying the relevant data structures. In Nemesis, as in other systems without fork such as VMS, this can be achieved by other means.
The kernel's view of a domain is limited to a single data structure called the Domain Control Block, or DCB. This contains scheduling information, communication end-points, a protection domain identifier, an upcall entry point for the domain, and a small initial stack. The DCB is divided into two areas. One is writable by the domain itself, the other is readable but not writable. A privileged service called the Domain Manager creates DCBs and links them into the scheduler data structures.
The arguments to the Domain Manager are a set of QoS parameters for the new domain, together with a single closure pointer of type DomainEntryPoint. This closure provides the initial entry point to the domain in its sole operation (called Go), and the state record should contain everything the new domain needs to get going.
The creation of this DCB is the only involvement the operating system proper has in the process. Everything else is performed by the two domains involved: the parent creates the initial closure for the domain, and the child on startup locates all the necessary services it needs which have not been provided by the parent. Figure 2.5 shows the process.
Figure 2.5: Creation of a new domain
The DomainEntryPoint closure is the equivalent of main in UNIX, with the state record taking the place of the command line arguments. By convention the calling domain creates the minimum necessary state, namely:
To save a programmer the tedium of writing both sides of the closure initialisation code, a module is provided called the Builder. The Builder takes a ThreadClosure which represents the initial thread of a multi-threaded domain to be created. The Builder instantiates an initial heap for the new domain. Most heap implementations in Nemesis use a single area of storage for both their internal state and the memory they allocate, so the parent domain can create a heap, allocate initial structures for the child within it, and then hand it over in its entirety to the new domain.
The Builder returns a DomainEntryPoint closure which can be passed to the Domain Manager. When started up, the new domain executes code within the Builder module which carries out conventional initialisation procedures, including instantiating a threads package. The main thread entry point is also Builder code, creating the remaining state before entering the thread procedure originally specified. Figure 2.6 shows the sequence of events.
Figure 2.6: Action of the Builder
This illustrates two situations where a module executes in two domains. In the first, part of the Builder executes the parent and another part executes in the child. In the second, a single heap object is instantiated and executes in the parent, and is then handed off to the child, which continues to invoke operations upon it. In both cases the programmer need not be aware of this, and instantiating a new domain with a fully functional runtime system is a fairly painless operation.