Project suggestions

Project Suggestions in Applied Semantics, 2004-2005

(these will be updated from time to time)

The projects below are outline suggestions - feel free to come and discuss how they could be turned into full proposals, or to suggest others in the same general area. Contact us by email: Peter Sewell, Keith Wansbrough.

Acute runtime implementation
TCP - model implementation and flow-control parameterisation
A Debian apt undo
The Munger
An Informal Proof Assistant
Automated Testing from a Formal Specification
Distributed Infrastructure above the TCP/UDP/Sockets semantics
Post-hoc Makefile Semantics

Acute runtime implementation

Project description:

A group of us have developed the experimental programming language acute for high-level distributed programming. The language has a complete formal definition and an implementation (in Fresh OCaml). The runtime is essentially an interpreter operating over the abstract syntax (extended with closures and explicit representations of evaluation contexts).

This project is to implement a more efficient runtime for all or part of the language. Particularly interesting would be to target the OCaml bytecode, or a mild variation thereof.

Special resources:

TCP - model implementation and flow-control parameterisation

Project description:: We're working on a post-hoc specification of TCP - as it is actually implemented. The specification is expressed in Higher-Order Logic (with the HOL tool); it's being validated against the behaviour of linux, BSD and WXP stacks. This project would involve writing a clean implementation of TCP based on our spec (with code structured for clarity rather than maximum speed), likely in OCaml. Initially it could be run purely in userspace (we have OCaml libraries for packet injection and monitoring that could be used); perhaps later it could be tied in with the linux or BSD sockets interface. If the project goes well, an interesting extension would be to refactor the code to enable the existing congestion control algorithms to be smoothly replaced by recently-proposed alternatives - congestion marking, etc.
Special resources:: For the later parts: a linux or BSD machine; some test environment.

A debian apt-undo

Project description:: Package management operations really should support an undo operation. Details to be discussed.
Special resources:

The Munger, and An Informal Proof Assistant (several possible projects here)

Project description:

Working with realistic programming language type systems or operational semantics can be tiresome, simply because of the size of the definitions. Better tool support would be very useful, for typesetting, for informal proof, for animation, for use as an oracle for automatic testing of implementations, and for machine-checked proof. For an informal proof tool one might support variable binding, definition instantiation, multi-case definitions, derivation-search for inductive definitions, case management in proofs, view-folding, and pretty-printing.

For pretty-printing, we currently have a system (the Munger mk.2) that typesets into LaTeX mathematics that has been input directly in (ASCIIfied) conventional notation. One possibility would be to redesign this from the ground up, so one could specify the grammar for ASCII notation, together with the LaTeX it should be rendered into, something like this:

e ::= () { () } ( COMMALIST(es) ) { ( COMMALIST(es) ) } \x:T.e { \lambda x : T. e } col e hm T { [ e ]_{ hm }^{ T } }

and have a lexer, parser, and pretty-printer automatically generated. This would let one write informal ASCII mathematics in perhaps the most lightweight style possible and still produce production-quality typeset output. Choosing an appropriate class of grammars is an interesting question here: if LALR(1) is not rich enough to deal with our informal grammars (or if yacc is too tiresome), we could think of taking advantage of the fact that we rarely want to munge much code at a time, so could use a class of grammars that is less efficiently (but more simply?) parsable. Going further, one might do some lightweight sanity checking of the syntax.

Another possibility is to produce an implementation of some of the other features as an extension to emacs. It would require some understanding of informal proofs, careful user-interface design, and a certain amount of Emacs lisp.

A further option would be to focus on variable binding, substitution, rule instantiation, etc, implementing the Nominal Unification algorithm recently developed by Urban, Pitts and Gabbay, implementing a backtracking search for rule instantiation, and tying that into an editor and/or pretty-printer. See James Cheney's alphaProlog.

Some more detailed discussion of desirable features is available on request. For all of these, most of the system would probably be best written in OCaml or ML.

Alisdair Wren also has ideas along these lines, and may be prepared to supervise a project.

Special resources:

Distributed Infrastructure above the TCP/UDP/Sockets semantics

Project description:: A group of us are producing a post-hoc semantics for UDP and TCP socket communication (see here), generating a precise formal model of the behaviour of common implementations (Linux, FreeBSD, Windows XP). This project is to select and implement some interesting distributed algorithms, above our OCaml language binding for the modelled sockets primitives, using the model as an informal reference. Candidate algorithms include peer-to-peer algorithms such as Chord or Pastry, or randomized algorithms such as van Renesse's Gossip-style Failure Detection Service.
Special resources:

Post-hoc Makefile semantics

Project description:: This project is to produce a post-hoc semantics for the language of Makefiles (or a similar build system) and to test it against the implementations.
Special resources:

Peter.Sewell@cl.cam.ac.uk

e ::=
	()	{ () }
	( COMMALIST(es) )	{ ( COMMALIST(es) ) }
	\x:T.e	{ \lambda x : T. e }
	col e hm T	{ [ e ]_{ hm }^{ T } }