Computer Laboratory Home Page Search A-Z Directory Help
University of Cambridge Home Computer Laboratory
Introduction to Functional Programming
Computer Laboratory > Course material 2004-05 > Introduction to Functional Programming

Introduction to Functional Programming
2004-05

Principal lecturer: Mr Russell Ross
Taken by: Part II (General), Diploma

Syllabus
Past exam questions

Lecture Slides

  1. 20 January 2005—Introduction and overview
  2. 25 January 2005—Basic types and functions
  3. 27 January 2005—Lists and patterns
  4. 1 February 2005—List functionals
  5. 3 February 2005—Functional queues and sorting
  6. 8 February 2005—Datatypes and binary trees
  7. 10 February 2005—Red-black trees and Huffman encoding
  8. 15 February 2005—Arrays and random-access lists
  9. 17 February 2005—Lazy lists
  10. 22 February 2005—Type inference
  11. 24 February 2005—Symbolic differentiation
  12. 1 March 2005—Recursive descent parsers

Exercise Sheets

Here are some practical exercises that you can work on. I plan to post three sheets in total. For additional practice, try the tick exercises from Larry Paulson's Part IA course.

Texts

References

Other Notes

Lecture 1—Introduction and overview

Make sure you have access to at least one SML system: Objective Caml is a different dialect of ML that is quite popular. It has the best compiler and an active user community.

Lecture 3—Lists and patterns

The slides incorrectly refer to case … of … end, but case … of expressions don't terminate with end. See lecture 4 for an example of case … of.

Lecture 4—List functionals

In an earlier lecture, someone asked how to define your own infix operators. There are two steps required. Declare the operator as infix:
infix 5 ++
Here 5 indicates the precedence of this operator. Use infixr for right-associative operators. Then when defining the operator, use op as in this example:
fun op ++ (a, b) = a + b
Note that alphabetic names work fine as well (like the built-in o operator).

A student asked if orelse and andalso had non-short-circuit versions. The basis library has this to say about it:

The semantics of strict AND and OR operators, which would evaluate both expressions before applying the operator, are rarely needed and can easily be obtained using the andalso and orelse operators.

Finally, I challenged you at the end of the lecture to figure out why the version of decode using simple recursion inverts the dictionary every time it is used, while the version using functionals only inverts it once. To understand the answer, follow the dict value from the definition of decode through to the definition of encodelist. When does each curried function have all the arguments it needs to evaluate its body?

Lecture 5—Functional queues and sorting

Our discussion of sorting was spoiled somewhat when we were forced to use Moscow ML instead of SML/NJ for our tests, and some tests were too slow to finish in time. I've done the testing on SML/NJ with the following results:

Input data Insertion sort Quick sort Merge sort
10k random 2.63 0.03 0.02
10k sorted 6.04 7.92 0.01
10k reversed 0.00 13.43 0.01
100k random 658.97 0.71 0.38
100k sorted 1496.91 1236.32 0.24
100k reversed 0.05 2626.53 0.34
1000k random 24.23 13.49
1000k sorted 8.97
1000k reversed 1.86 7.26

All times are in seconds. Times under a minute represent the average of 10 runs, while those over a minute are single runs.

Lecture 7—Red-black trees and Huffman encoding

In addition to the overhead slides, I prepared two things to help understand red-black trees:
  1. A diagram showing the cases handled by balance
  2. An interactive red-black tree demo applet
For the Huffman encoding, I encourage you to write the necessary code to actually encode something and decode it. The two hard parts are converting strings into lists of R and L values and gathering lists of character frequencies.

For the former, look into the Word basis library to extract bits from an int (remember that ord turns a character into an int and explode turns a string into a list of characters).

An easy way to count character frequencies is to sort the list of characters (which puts all occurrences of a character together) and then count how many characters occur in each group. You can do this with a fold or a simple recursive function.

Putting all the pieces together and making the little changes necessary (turning a red-black set into a red-black dictionary) is a good way to make sure you understand what is going on and get a little practice working with the code we've explored. If you run into difficulties, send me an email.

Lecture 10—Type inference

Type inference is an incremental process:
  • Assign type variables to program variables
  • Examine expressions to find relationships between type variables
  • Unify the type variables based on those relationships
We start out by assigning a type variable to every variable in the expression. At this stage, the only type variables we can unify are those assigned to the same variable, i.e., if x is given type variable α in one place, the same x must be given type α every place it is used.

Next, we look at expressions to see how types are used together. If two variables f and x are put next to each other (juxtaposed) in a way that implies function application, e.g., (f x), then we can glean new information from this: f must be a function with x's type as its input and some other type as its output. By examining the expressions (and not just the variables in the expressions), we have more information to help us unify the types: now we know that if x has type γ, then f must have type &gamma→ε for some ε. We go back and find every instance of f's old type and unify it with the new type information.

Any time we unify types, we first have to get some new information to drive the process. That information comes the expressions, including, among other things:

  • Variable identity: x is the same type as x when they are the same x
  • Constant expressions: [] is type α list and 5 is type int
  • Type constructors: (x, y) is x's type * y's type, and Br x is of type α tree and implies that x must be of type α * α tree * α tree
  • Function application: f x means f must be a function from x's type to some other type ε, and the whole expression f x has type ε
  • Pattern matching: the right-hand-side expressions in patterns must all have the same type, so in fn 0 => 1 | n => g n, g must be a function from int (this comes from n being the input argument) to int (the type of the expression g n must match the type of 1, which is known to be int).

This process is repeated until we've infered all type relations that we can from the expressions and unified all types as much as possible. What we are left with is the most general type for the expression.