Dynamic Rebinding for Marshalling and Update, via Redex-time and Destruct-time Reduction. Peter Sewell, Gareth Stoyle, Michael Hicks, Gavin Bierman, and Keith Wansbrough. Journal of Functional Programming, 18(4):437--502, 2008. [ bib | doi | pdf ]
Most programming languages adopt static binding, but for distributed programming an exclusive reliance on static binding is too restrictive: dynamic binding is required in various guises, for example, when a marshalled value is received from the network, containing identifiers that must be rebound to local resources. Typically, it is provided only by ad hoc mechanisms that lack clean semantics. In this paper, we adopt a foundational approach, developing core dynamic rebinding mechanisms as extensions to the simply typed call-by-value lambda calculus. To do so, we must first explore refinements of the call-by-value reduction strategy that delay instantiation, to ensure computations make use of the most recent versions of rebound definitions. We introduce redex-time and destruct-time strategies. The latter forms the basis for a lambda-marsh calculus that supports dynamic rebinding of marshalled values, while remaining as far as possible statically typed. We sketch an extension of lambda-marsh with concurrency and communication, giving examples showing how wrappers for encapsulating untrusted code can be expressed. Finally, we show that a high-level semantics for dynamic updating can also be based on the destruct-time strategy, defining a lambda-update calculus with simple primitives to provide type-safe updating of running code. We show how the ideas of this simple calculus extend to more real-world, module-level dynamic updating in the style of Erlang. We thereby establish primitives and a common semantic foundation for a variety of real-world dynamic rebinding requirements.
 
Acute: High-level programming language design for distributed computation. Peter Sewell, James J. Leifer, Keith Wansbrough, Francesco Zappa Nardelli, Mair Allen-Williams, Pierre Habouzit, and Viktor Vafeiadis. Journal of Functional Programming, 17(4--5):547--612, July 2007. Invited submission for an ICFP 2005 special issue. [ bib | doi | project page | ps | pdf | http ]
Existing languages provide good support for typeful programming of stand-alone programs. In a distributed system, however, there may be interaction between multiple instances of many distinct programs, sharing some (but not necessarily all) of their module structure, and with some instances rebuilt with new versions of certain modules as time goes on. In this paper, we discuss programming-language support for such systems, focussing on their typing and naming issues. We describe an experimental language, Acute, which extends an ML core to support distributed development, deployment, and execution, allowing type-safe interaction between separately built programs. The main features are (1) type-safe marshalling of arbitrary values; (2) type names that are generated (freshly and by hashing) to ensure that type equality tests suffice to protect the invariants of abstract types, across the entire distributed system; (3) expression-level names generated to ensure that name equality tests suffice for type safety of associated values, for example, values carried on named channels; (4) controlled dynamic rebinding of marshalled values to local resources; and (5) thunkification of threads and mutexes to support computation mobility. These features are a large part of what is needed for typeful distributed programming. They are a relatively lightweight extension of ML, should be efficiently implementable, and are expressive enough to enable a wide variety of distributed infrastructure layers to be written as simple library code above the byte-string network and persistent store APIs. This disentangles the language run-time from communication intricacies. This paper highlights the main design choices in Acute. It is supported by a full language definition (of typing, compilation, and operational semantics), by a prototype implementation, and by example distribution libraries.
 
Type-Safe Distributed Programming for OCaml. John Billings, Peter Sewell, Mark Shinwell, and Rok Strniša. In ML 2006. [ bib | doi | project page | ps | pdf | http ]
Existing ML-like languages guarantee type-safety, ensuring memory safety and protecting the invariants of abstract types, but only within single executions of single programs. Distributed programming is becoming ever more important, and should benefit even more from such guarantees. In previous work on theoretical calculi and the Acute prototype language we outlined techniques to provide them for simple languages.

In this paper we put these ideas into practice, describing the HashCaml extension to the OCaml bytecode compiler, which supports type-safe and abstraction-safe marshalling, together with related naming constructs. Our contribution is threefold: (1) We show how to define globally meaningful runtime type names for key OCaml type constructs that were not covered in our previous work, dealing with the generativity issues involved: user-defined variant and record types, substructures, functors, arbitrary ascription,separate compilation, and external C functions. (2) We support marshalling within polymorphic functions by type-passing, requiring us to build compositional runtime type names and revisit the OCaml relaxed value restriction. We show that with typed marshalling one must fall back to the SML97 value restriction. (3) We show how the above can be implemented with reasonable performance as an unintrusive modification to the existing OCaml language, implementation, and standard libraries. An alpha release of HashCaml, capable of bootstrapping itself, is available, along with an example type-safe distributed communication library written in the language.

 
HashCaml release, version 3.09.1-alpha-795, John Billings, Peter Sewell, Mark Shinwell, and Rok Strniša, April 2006. [ bib | project page | .html ]
 
Acute: High-level programming language design for distributed computation. Peter Sewell, James J. Leifer, Keith Wansbrough, Francesco Zappa Nardelli, Mair Allen-Williams, Pierre Habouzit, and Viktor Vafeiadis. In ICFP 2005. [ bib | doi | project page | ps | pdf | http ]
Existing languages provide good support for typeful programming of standalone programs. In a distributed system, however, there may be interaction between multiple instances of many distinct pro-grams, sharing some (but not necessarily all) of their module structure, and with some instances rebuilt with new versions of certain modules as time goes on. In this paper we discuss programming language support for su ch systems, focussing on their typing and naming issues.

We describe an experimental language, Acute, which extends an ML core to support distributed development, deployment, and execution, allowing type-safe interaction between separately-built programs. The main features are: (1) type-safe marshalling of arbitrary values; (2) type names that are generated (freshly and by hashing) to ensure that type equality tests suffice to protect the invariants of abstract types, across the entire distributed system; (3) expression-level names generated to ensure that name equality tests suffice for type-safety of associated values, e.g. values carried on named channels; (4) controlled dynamic rebinding of marshalled values to local resources; and (5) thunkification of threads and mutexes to support computation mobility.

These features are a large part of what is needed for typeful distributed programming. They are a relatively lightweight extension of ML, should be efficiently implementable, and are expressive enough to enable a wide variety of distributed infrastructure layers to be written as simple library code above the byte-string network and persistent store APIs. This disentangles the language runtime from communication intricacies. This paper highlights the main design choices in Acute. It is supported by a full language definition (of typing, compilation, and operational semantics), by a prototype implementation, and by example distribution libraries.

 
Acute release, Peter Sewell, James J. Leifer, Keith Wansbrough, Mair Allen-Williams, Francesco Zappa Nardelli, Pierre Habouzit, and Viktor Vafeiadis, January 2005. [ bib | project page | http ]
 
Acute: High-level programming language design for distributed computation. Design rationale and language definition. Peter Sewell, James J. Leifer, Keith Wansbrough, Mair Allen-Williams, Francesco Zappa Nardelli, Pierre Habouzit, and Viktor Vafeiadis. Technical Report UCAM-CL-TR-605, University of Cambridge Computer Laboratory, October 2004. Also published as INRIA RR-5329. 193pp. [ bib | project page | ps | pdf | .html ]
This paper studies key issues for distributed programming in high-level languages. We discuss the design space and describe an experimental language, Acute, which we have defined and implemented.

Acute extends an OCaml core to support distributed development, deployment, and execution, allowing type-safe interaction between separately-built programs. It is expressive enough to enable a wide variety of distributed infrastructure layers to be written as simple library code above the byte-string network and persistent store APIs, disentangling the language runtime from communication.

This requires a synthesis of novel and existing features: (1) type-safe marshalling of values between programs; (2) dynamic loading and controlled rebinding to local resources; (3) modules and abstract types with abstraction boundaries that are respected by interaction; (4) global names, generated either freshly or based on module hashes: at the type level, as runtime names for abstract types; and at the term level, as channel names and other interaction handles; (5) versions and version constraints, integrated with type identity; (6) local concurrency and thread thunkification; and (7) second-order polymorphism with a namecase construct.

We deal with the interplay among these features and the core, and develop a semantic definition that tracks abstraction boundaries, global names, and hashes throughout compilation and execution, but which still admits an efficient implementation strategy.

 
Acute and TCP: specifying and developing abstractions for global computation. James Leifer, Michael Norrish, Peter Sewell, and Keith Wansbrough. In APPSEM II Workshop, 2004, 2pp. [ bib ]
 
Dynamic Rebinding for Marshalling and Update, with Destruct-time λ. Gavin Bierman, Michael Hicks, Peter Sewell, Gareth Stoyle, and Keith Wansbrough. Technical Report UCAM-CL-TR-568, University of Cambridge Computer Laboratory, February 2004. 85pp. [ bib | ps | pdf | .html ]
Most programming languages adopt static binding, but for distributed programming an exclusive reliance on static binding is too restrictive: dynamic binding is required in various guises, for example when a marshalled value is received from the network, containing identifiers that must be rebound to local resources. Typically it is provided only by ad-hoc mechanisms that lack clean semantics.

In this paper we adopt a foundational approach, developing core dynamic rebinding mechanisms as extensions to the simply-typed call-by-value λ-calculus. To do so we must first explore refinements of the call-by-value reduction strategy that delay instantiation, to ensure computations make use of the most recent versions of rebound definitions. We introduce redex-time and destruct-time strategies. The latter forms the basis for a λ-marsh calculus that supports dynamic rebinding of marshalled values, while remaining as far as possible statically-typed. We sketch an extension of λ-marsh with concurrency and communication, giving examples showing how wrappers for encapsulating untrusted code can be expressed. Finally, we show that a high-level semantics for dynamic updating can also be based on the destruct-time strategy, defining a λ-update calculus with simple primitives to provide type-safe updating of running code. We thereby establish primitives and a common semantic foundation for a variety of real-world dynamic rebinding requirements.

 
Dynamic Rebinding for Marshalling and Update, with Destruct-time lambda. Gavin Bierman, Michael Hicks, Peter Sewell, Gareth Stoyle, and Keith Wansbrough. In ICFP 2003. [ bib | doi | ps | pdf | http ]
Most programming languages adopt static binding, but for distributed programming an exclusive reliance on static binding is too restrictive: dynamic binding is required in various guises, for example when a marshalled value is received from the network, containing identifiers that must be rebound to local resources. Typically it is provided only by ad-hoc mechanisms that lack clean semantics. In this paper we adopt a foundational approach, developing core dynamic rebinding mechanisms as extensions to simply-typed call-by-value lambda-calculus. To do so we must first explore refinements of the call-by-value reduction strategy that delay instantiation, to ensure computations make use of the most recent versions of rebound definitions. We introduce redex-time and destruct-time strategies. The latter forms the basis for a lambda-marsh calculus that supports dynamic rebinding of marshalled values, while remaining as far as possible statically-typed. We sketch an extension of lambda-marsh with concurrency and communication, giving examples showing how wrappers for encapsulating untrusted code can be expressed. Finally, we show that a high-level semantics for dynamic updating can also be based on the destruct-time strategy, defining a lambda-update calculus with simple primitives to provide type-safe updating of running code. We thereby establish primitives and a common semantic foundation for a variety of real-world dynamic rebinding requirements.
 
Global abstraction-safe marshalling with hash types. James Leifer, Gilles Peskine, Peter Sewell, and Keith Wansbrough. In ICFP 2003. [ bib | doi | ps | pdf | http ]
Type abstraction is a key feature of ML-like languages for writing large programs. Marshalling is necessary for writing distributed programs, exchanging values via network byte-streams or persistent stores. In this paper we combine the two, developing compile-time and run-time semantics for marshalling, that guarantee abstraction-safety between separately-built programs. We obtain a namespace for abstract types that is global, i.e. meaningful between programs, by hashing module declarations. We examine the scenarios in which values of abstract types are communicated from one program to another, and ensure, by constructing hashes appropriately, that the dynamic and static notions of type equality mirror each other. We use singleton kinds to express abstraction in the static semantics; abstraction is tracked in the dynamic semantics by coloured brackets. These allow us to prove preservation, erasure, and coincidence results. We argue that our proposal is a good basis for extensions to existing ML-like languages, pragmatically straightforward for language users and for implementors.
 
Global abstraction-safe marshalling with hash types. James Leifer, Gilles Peskine, Peter Sewell, and Keith Wansbrough. Technical Report UCAM-CL-TR-569, University of Cambridge Computer Laboratory, June 2003. Also published as INRIA Rocquencourt report RR-4851. 86pp. [ bib | ps | pdf | .html ]
Type abstraction is a key feature of ML-like languages for writing large programs. Marshalling is necessary for writing distributed programs, exchanging values via network byte-streams or persistent stores. In this paper we combine the two, developing compile-time and run-time semantics for marshalling, that guarantee abstraction-safety between separately-built programs.

We obtain a namespace for abstract types that is global, ie meaningful between programs, by hashing module declarations. We examine the scenarios in which values of abstract types are communicated from one program to another, and ensure, by constructing hashes appropriately, that the dynamic and static notions of type equality mirror each other. We use singleton kinds to express abstraction in the static semantics; abstraction is tracked in the dynamic semantics by coloured brackets. These allow us to prove preservation, erasure, and coincidence results. We argue that our proposal is a good basis for extensions to existing ML-like languages, pragmatically straightforward for language users and for implementors.

 
Modules, Abstract Types, and Distributed Versioning. Peter Sewell. In POPL 2001. [ bib | doi | errata | pdf | http ]
In a wide-area distributed system it is often impractical to synchronise software updates, so one must deal with many coexisting versions. We study static typing support for modular wide-area programming, modelling separate compilation/linking and execution of programs that interact along typed channels. Interaction may involve communication of values of abstract types; we provide the developer with finegrain versioning control of these types to support interoperation of old and new code. The system makes use of a second-class module system with singleton kinds; we give a novel operational semantics for separate compilation/linking and execution and prove soundness.
 
Modules, Abstract Types, and Distributed Versioning. Peter Sewell. Technical Report UCAM-CL-TR-506, University of Cambridge, September 2000. 46pp. [ bib | errata | ps | pdf | .html ]
In a wide-area distributed system it is often impractical to synchronise software updates, so one must deal with many coexisting versions. We study static typing support for modular wide-area programming, modelling separate compilation/linking and execution of programs that interact along typed channels. Interaction may involve communication of values of abstract types; we provide the developer with fine-grain versioning control of these types to support interoperation of old and new code. The system makes use of a second-class module system with singleton kinds; we give a novel operational semantics for separate compilation/linking and execution and prove soundness.