Computer Laboratory

Dominic's Part II Project Proposals 2012/13

The following are sketches of project ideas and are not complete (or static) proposals . If you are interested, please do get in touch.

• OCamlLabs: Types for cross-compilation

Increasing numbers of applications are targetting heterogenous execution environments such as server-side, client-side, GPU, clouid, embedded and web. This is supported by a number of OCaml compiler backends (JavaScript, Java, ARM, LLVM) and libraries allowing interoperability. In such a scenario, certain parts of code may be compiled only for certain platform-specific environments, but some can be safely shared. This project should extend the OCaml type system in order to track execution environments for which an expression can be compiled, and simplify writing code that cross-compiles for multiple platforms and diverse execution environments.

This project idea comes from the OCamlLabs web site (which includes some links to related material) and is also included in Tomas Petricek's project suggestions. Either of us would be happy to supervise/be involved. As the name suggests, this should be done as an extension to the OCaml compiler, but if you're interested in other tools and languages, do talk to either of us.

• TAKEN - EDSL (Embedded Domain Specific Language) for Array Programming on GPUS

Stencil computations are a common programming pattern in scientific computing, games, and image processing, involving operations over the elements of an array defined by a local operation, possibly depending on some neighbourhood of elements. For example, the discrete Gaussian blur operation can be defined by the operation:

(A[i-1][j] + A[i+1][j] + A[i][j-1] + A[i][j+1] + 4*A[i][j]) / 6.0
which can then be applied to an array by iterating over its index space.

Stencil computations are highly regular and amenable to optimisation and parallelisation. However, general-purpose languages obscure this regular pattern from the compiler, and even the programmer, preventing optimisation and obfuscating (in)correctness.

The Ypnos domain-specific language, developed in the CPRG, provides support for stencil computations embedded in Haskell. Ypnos allows declarative, abstract specification of stencil computations, exposing the structure of a problem to the compiler and to the programmer via specialised syntax. The discrete Gaussian operator can be written in Ypnos as:

gauss | _ l  _ | = (l + r + b + r + 4*c)/6.0
| r @c b |
| _ t  _ |

where the figure, spanning three lines following guass and preceding =, is a special pattern matching syntax called a grid pattern, defining array access. Since pattern matching is static, infering the data access pattern for a stencil computation is simple (essentially parsing). As Ypnos is embedded in Haskell, grid patterns are provided by a small syntactic extension using Haskell's macroing system.

A stencil computation is applied to an array, or grid in the Ypnos terminology, using a higher order function, such as run, e.g., x' = run gauss x.

Various language invariants and program properties are encoded as Haskell types and enforced by the type system of the GHC compiler. For example, the gauss function has type:

gauss :: (Safe (-1, 0) b, Safe (+1, 0) b, Safe (0, +1) b, Safe (0, -1) b, Num a)
=> Grid (Dim X :* Dim Y) b dyn a -> a

The type class constraint Safe is used to ensure that the parameter grid has adequate boundary elements defined such that the guass operator never causes an out-of-bounds error when applied using run.

Currently there is an Ypnos implementation for two-dimensional arrays, but many of the core combinators are missing. Ypnos programs are highly data parallel, and in principle could be computed on a GPU. The aim of this project is to extend the current implementation of Ypnos with GPU support (perhaps using GHC's ACCELERATE library), and extend the combinators currently provided to make programming in Ypnos more practical. Prior familiarity with Haskell is highly recommended. Other extensions include extending Ypnos to other data types (trees, meshes, graphs).

There is scope for a project considering more theoretical aspects, such as working out a scheme for data-type generic grid pattern syntax, and exploring the mathematical structure behind Ypnos (based in category theory).

Resources   The current implementation can be found on GitHub. The original Ypnos paper describes the general motivation and philosophy of Ypnos as well as explaining its general features. A followup paper describes in more depth the type system of core Ypnos features, using the advanced type system features of GHC Haskell.

• Inductive programming with ellipses and subscripts

Ellipsis notation (...) is commonly use in mathematics and in pseudo-code as syntactic sugar for iteration, repetition, sequences, extension of a pattern, ... .

For example, summation of numbers 0 to n might be written:

0 + 1 + ... + n
or the summation of elements in a set might be written as the following, using subscripts to index the elements:
\sum x = x0 + ... + x{n-1}
(some more examples can be found here: Ellipsis/In mathematical notation (Wikipedia).

Haskell has some support for defining (inclusive) ranges using ellipses and the list notation e.g.

[0..9] == [0,1,2,3,4,5,6,7,8,9]
['a'..'h'] == ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'] = "abcdefgh"
[0..] == [0,1,2,3,4,5,6,7,8,9,10,11,12,... -- infinite list
[False .. True] = [False, True]
[True .. False] = []
[0,2..9] == [0,2,4,6,8] -- note different form [a,b..c] where interval is specified
[10..1] == []
[10,9..1] == []
[0,2..] = [0,2,4,6,... -- infinite list
[0.1..5] == [0.1, 1.1, 2.1, 3.1, 4.1, 5.1] -- slightly oddly

Other languages use ellipses in similar ways, such as ranges in Perl, Ruby and variable arguments in C.

In Haskell, any type which is an instance of the Enum type class has range operations using ellipses (see Enum documentation), which is defined for many orderable types.

Ellipses and subscripting notations are compact and concise and used pervasively. Why can't we use them in programming more widely and form more operation than just defining ranges?

The aim of this project is to extended support for programming with ellipses and subscripts.

1. Compare:
map f [] = []
map f (x:xs) = (f x):(map f xs)
with:
map f [x_0, ..., x_n] = [f x_0, ..., f x_n]
2. Compare:
g [] = 0
g (x:xs) = x + (g xs)
with:
g [x_0, ..., x_n] = x_0 + ... + x_n

Can this be extended to inductively-defined types? For example:

data BTree a = Leaf a | Node a (BTree a) (BTree a)

treeMap f (Node x (... Leaf y) (... Leaf z)) = Node (f x) (... (Leaf (f y)) (... (Leaf (f z))))

The project might take the form of extending the GHC compiler for Haskell with an extension for programming with ellipsis/subscripts, or providing macro support using Haskell's macro/quotation system.