Introduction to Functional Programming 2004-05
Principal lecturer: Mr Russell Ross
Taken by: Part II (General), Diploma
Syllabus
Past exam questions
Lecture Slides
- 20 January 2005—Introduction and overview
- 25 January 2005—Basic types and functions
- 27 January 2005—Lists and patterns
- 1 February 2005—List functionals
- 3 February 2005—Functional queues and sorting
- 8 February 2005—Datatypes and binary trees
- 10 February 2005—Red-black trees and Huffman encoding
- 15 February 2005—Arrays and random-access lists
- 17 February 2005—Lazy lists
- 22 February 2005—Type inference
- 24 February 2005—Symbolic differentiation
- 1 March 2005—Recursive descent parsers
Exercise Sheets
Here are some practical exercises that you can work on. I plan to post three
sheets in total. For additional practice, try the tick exercises from Larry
Paulson's Part IA course.
Texts
References
Other Notes
Lecture 1—Introduction and overview
Make sure you have access to at least one SML system, such as SML/NJ or
Moscow ML.
Objective Caml is a different dialect of ML that is quite popular. It has the
best compiler and an active user community.
Lecture 3—Lists and patterns
The slides incorrectly refer to case … of … end,
but case … of expressions don't terminate with end. See
lecture 4 for an example of case … of.
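For reference, a small case … of expression with the correct syntax looks like this (note: no end):

```sml
(* A case ... of expression; it is not terminated with 'end'. *)
fun describe n =
    case n of
        0 => "zero"
      | _ => "nonzero"
```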
Lecture 4—List functionals
In an earlier lecture, someone asked how to define your own infix operators. Two
steps are required. First, declare the operator as infix:
infix 5 ++
Here 5 indicates the precedence of this operator. Use infixr for
right-associative operators. Then when defining the operator, use op as in
this example:
fun op ++ (a, b) = a + b
Note that alphabetic names work fine as well (like the built-in o operator).
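Putting the two steps together, here is the ++ example from above in full, along with a use of it:

```sml
(* Declare ++ as a left-associative infix operator at precedence 5. *)
infix 5 ++

(* Define it; 'op' lets us write the infix name in prefix position. *)
fun op ++ (a, b) = a + b

val five = 2 ++ 3   (* evaluates to 5 *)
```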
A student asked if orelse and andalso had non-short-circuit versions.
The basis library
has this to say about it:
The semantics of strict AND and OR operators, which would evaluate
both expressions before applying the operator, are rarely needed and can easily
be obtained using the andalso and orelse operators.
Finally, I challenged you at the end of the lecture to figure out why the version of
decode using simple recursion inverts the dictionary every time it is used,
while the version using functionals only inverts it once. To understand the answer,
follow the dict value from the definition of decode through to
the definition of encodelist. When does each curried function have all the
arguments it needs to evaluate its body?
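The same staging idea can be seen in a small self-contained sketch (the names dict, invert, encodeNaive, and encodeStaged are mine for illustration, not the lecture's code):

```sml
(* A toy dictionary as an association list. *)
val dict = [(1, "a"), (2, "b"), (3, "c")]

(* Swap keys and values; imagine this is expensive. *)
fun invert pairs = map (fn (k, v) => (v, k)) pairs

(* Naive version: re-inverts the dictionary on every call. *)
fun encodeNaive d s =
    case List.find (fn (k, _) => k = s) (invert d) of
        SOME (_, v) => v
      | NONE => raise Fail "not found"

(* Staged version: inversion happens once, when the first argument
   is supplied; the returned function reuses the result. *)
fun encodeStaged d =
    let val inv = invert d
    in
        fn s =>
            case List.find (fn (k, _) => k = s) inv of
                SOME (_, v) => v
              | NONE => raise Fail "not found"
    end

val encode = encodeStaged dict   (* invert runs here, exactly once *)
```

Each application of encodeNaive supplies both arguments at once, so invert runs inside its body every time; encodeStaged has everything it needs to run invert as soon as it receives d, and the resulting closure merely looks up inv.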
Lecture 5—Functional queues and sorting
Our discussion of sorting was spoiled somewhat when we were forced to use
Moscow ML instead of SML/NJ for our tests, and some tests were too slow to
finish in time. I've done the testing on SML/NJ with the following results:
Input data     | Insertion sort | Quick sort | Merge sort
10k random     |           2.63 |       0.03 |       0.02
10k sorted     |           6.04 |       7.92 |       0.01
10k reversed   |           0.00 |      13.43 |       0.01
100k random    |         658.97 |       0.71 |       0.38
100k sorted    |        1496.91 |    1236.32 |       0.24
100k reversed  |           0.05 |    2626.53 |       0.34
1000k random   |                |      24.23 |      13.49
1000k sorted   |                |            |       8.97
1000k reversed |           1.86 |            |       7.26
All times are in seconds. Times under a minute represent the average of 10
runs, while those over a minute are single runs.
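Merge sort's steady times across all three input orders reflect its O(n log n) cost being independent of the input's initial arrangement. A standard sketch (not necessarily the exact version from the lecture):

```sml
(* Split a list into two halves by alternating elements. *)
fun split [] = ([], [])
  | split [x] = ([x], [])
  | split (x :: y :: rest) =
        let val (a, b) = split rest in (x :: a, y :: b) end

(* Merge two sorted lists of ints into one sorted list. *)
fun merge ([], ys) = ys
  | merge (xs, []) = xs
  | merge (x :: xs, y :: ys) =
        if x <= y then x :: merge (xs, y :: ys)
        else y :: merge (x :: xs, ys)

fun mergeSort [] = []
  | mergeSort [x] = [x]
  | mergeSort xs =
        let val (a, b) = split xs
        in merge (mergeSort a, mergeSort b) end
```

By contrast, quicksort's worst case is triggered precisely by sorted and reversed input (a consistently bad pivot), which matches the blowups in the table.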
Lecture 7—Red-black trees and Huffman encoding
In addition to the overhead slides, I prepared two things to help understand
red-black trees:
- A diagram showing the cases handled
by balance
- An interactive red-black tree demo applet
For the Huffman encoding, I encourage you to write the necessary code to
actually encode something and decode it. The two hard parts are converting
strings into lists of R and L values and gathering lists of
character frequencies.
For the former, look into the Word structure in the basis library to extract
bits from an int (remember that ord turns a character into an
int and explode turns a string into a list of characters).
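As a hedged starting point for that step, here is one way to do it, assuming 8-bit character codes (the datatype and function names are mine, not from the lecture):

```sml
datatype direction = L | R

(* Extract the low 8 bits of a character's code as a list of
   directions, most significant bit first. *)
fun charToBits c =
    let
        val w = Word.fromInt (ord c)
        fun bit i =
            if Word.andb (Word.>> (w, Word.fromInt i), 0w1) = 0w1
            then R else L
    in
        List.tabulate (8, fn i => bit (7 - i))
    end
```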
An easy way to count character frequencies is to sort the list of characters
(which puts all occurrences of a character together) and then count how many
characters occur in each group. You can do this with a fold or a simple
recursive function.
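One way to realise the sort-then-group idea is the following sketch (helper names are mine, and I use a simple insertion sort to keep it self-contained; any sort will do):

```sml
(* Insert a character into an already-sorted list. *)
fun insert (c, []) = [c]
  | insert (c, d :: ds) =
        if c <= d then c :: d :: ds else d :: insert (c, ds)

fun sortChars cs = foldr insert [] cs

(* Count the length of each run of equal characters. *)
fun group [] = []
  | group (c :: cs) =
        let fun run (x, n, []) = [(x, n)]
              | run (x, n, y :: ys) =
                    if x = y then run (x, n + 1, ys)
                    else (x, n) :: run (y, 1, ys)
        in run (c, 1, cs) end

fun charFreqs s = group (sortChars (explode s))
```

Sorting puts equal characters next to each other, so a single left-to-right pass suffices to count them.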
Putting all the pieces together and making the little changes necessary
(turning a red-black set into a red-black dictionary) is a good way to make
sure you understand what is going on and get a little practice working with the
code we've explored. If you run into difficulties, send me an email.
Lecture 10—Type inference
Type inference is an incremental process:
- Assign type variables to program variables
- Examine expressions to find relationships between type variables
- Unify the type variables based on those relationships
We start out by assigning a type variable to every variable in the
expression. At this stage, the only type variables we can unify are
those assigned to the same variable, i.e., if x is given
type variable α in one place, the same x
must be given type α every place it is used.
Next, we look at expressions to see how types are used together. If
two variables f and x are put next to each other
(juxtaposed) in a way that implies function application, e.g.,
(f x), then we can glean new information from this:
f must be a function with x's type as its input
and some other type as its output. By examining the expressions (and
not just the variables in the expressions), we have more information
to help us unify the types: now we know that if x has type
γ, then f must have type
γ→ε for some ε. We go
back and find every instance of f's old type and unify it
with the new type information.
Any time we unify types, we first need some new information
to drive the process. That information comes from the expressions,
including, among other things:
- Variable identity: x is the same type as x
when they are the same x
- Constant expressions: [] is type
α list and 5 is type
int
- Type constructors: (x, y) is x's type
* y's type, and Br x is of type
α tree and implies that x must be
of type
α * α tree * α tree
- Function application: f x means f must
be a function from x's type to some other type
ε, and the whole expression f x
has type ε
- Pattern matching: the right-hand-side expressions in patterns
must all have the same type, so in
fn 0 => 1 | n => g n,
g must be a function from int (this comes from
n being the input argument) to int (the type of
the expression g n must match the type of
1, which is known to be int).
This process is repeated until we've inferred all type relations that
we can from the expressions and unified all types as much as
possible. What we are left with is the most general type for the
expression.
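To make the unification step concrete, here is a minimal sketch of a unifier over a toy type language (the representation and all names are mine, and a real inference engine needs more machinery, such as generalisation):

```sml
(* A toy type language: type variables, int, and function types. *)
datatype ty = TVar of string
            | TInt
            | Arrow of ty * ty

(* A substitution maps type-variable names to types. *)
type subst = (string * ty) list

fun apply s (TVar a) =
      (case List.find (fn (b, _) => a = b) s of
           SOME (_, t) => t
         | NONE => TVar a)
  | apply _ TInt = TInt
  | apply s (Arrow (t1, t2)) = Arrow (apply s t1, apply s t2)

(* Occurs check: does variable a appear in type t? *)
fun occurs a (TVar b) = a = b
  | occurs _ TInt = false
  | occurs a (Arrow (t1, t2)) = occurs a t1 orelse occurs a t2

fun unify (TInt, TInt) = []
  | unify (TVar a, t) = bind (a, t)
  | unify (t, TVar a) = bind (a, t)
  | unify (Arrow (a1, a2), Arrow (b1, b2)) =
      let val s1 = unify (a1, b1)
          val s2 = unify (apply s1 a2, apply s1 b2)
      in s2 @ s1 end
  | unify _ = raise Fail "type mismatch"
and bind (a, TVar b) = if a = b then [] else [(a, TVar b)]
  | bind (a, t) =
      if occurs a t then raise Fail "occurs check" else [(a, t)]
```

For example, unifying the two function types in f x forces the sides to agree pairwise, exactly as in the γ→ε step above.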