Binary search trees are simple and perform well on random
    data, but they perform poorly on ordered data, degrading to
    O(n) performance for common operations.
    Red-black trees are a popular family of balanced binary search
    trees, which keep the tree approximately balanced.
Every node in a red-black tree is colored either red or black. We maintain two invariants
Together, these invariants ensure that the longest possible path—one with alternating red and black nodes—is no more than twice as long as the shortest possible path—one with black nodes only.
* Based on Section 3.3 of Purely Functional Data Structures
Recall the datatype of a basic binary search tree
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree
We create a color type
datatype color = R | B
and augment the binary search tree type to include a color
    field. All Lf nodes are considered to be black.
datatype 'a rbtree =
    Lf | Br of color * 'a * 'a rbtree * 'a rbtree
lookup is similar to a basic binary search tree
- fun lookup Lf _ = false
    | lookup (Br (_, elt, l, r)) x =
        if x < elt then lookup l x
        else if x > elt then lookup r x
        else true;
> val lookup = fn : int rbtree -> int -> bool
    We ignore the color field, and everything else is the same.
insert must maintain the two invariants, so it
    has a few more changes
- fun insert root x =
    let fun ins Lf = Br (R, x, Lf, Lf)
          | ins (node as Br (color, elt, l, r)) =
              if x < elt then balance (color, elt, ins l, r)
              else if x > elt then balance (color, elt, l, ins r)
              else node
        val Br (_, elt, l, r) = ins root (* never returns Lf *)
    in Br (B, elt, l, r) end;
> val insert = fn : int rbtree -> int -> int rbtree
    There are three significant changes
ins Lf case)insBr constructor with
	calls to the balance function in the two
	recursive cases.The balance function works like a
    Br constructor, but it rearranges its arguments
    when necessary to maintain the balance invariants.
Making all new nodes red maintains invariant 2, but violates
    invariant 1 if the parent of the new node is also red.
    balance works by fixing this red-red violation,
    possibly creating a new one higher up the tree.
At most one red-red violation exists at a time, and the calls
    to balance make it bubble up the tree to the root.
    Change #2 in the insert function fixes this case by
    always making the root node black.
See the figure to understand how balance
    works.
In each case, balance transforms a black node
    with a red child and a red grandchild into the same
    configuration—a red node with two black children.
The new red node may have a red parent, but this will be handled at the next level up as the recursion unwinds.
In each of the red-red cases, we label the values so that an in-order traversal yields the sequence {A, X, B, Y, C, Z, D}, which matches the output configuration.
Pattern matching makes the code amazingly simple and elegant (compare this to an implementation in Java or a similar language). We simply describe the four possible cases, and in each case we construct the exact same thing. The code is a direct transcription of the diagram
- fun balance (B, z, Br(R, y, Br(R, x, a, b), c), d) = …
    | balance (B, z, Br(R, x, a, Br(R, y, b, c)), d) = …
    | balance (B, x, a, Br(R, z, Br(R, y, b, c), d)) = …
    | balance (B, x, a, Br(R, y, b, Br(R, z, c, d))) =
          Br(R, y, Br(B, x, a, b), Br(B, z, c, d))
    | balance body = Br body;
> val 'a balance = fn : color * 'a * 'a rbtree * 'a rbtree ->
    'a rbtree
    Here the first three cases are abbreviated for readability, but all four cases have the same code as the fourth case shown here.
Another interesting kind of tree is useful for data encoding and forms the basis of Huffman data compression.
We construct a binary tree with a character stored in each leaf. Instead of coding a character as a byte (8 bits), we encode it as the path through the tree to find the leaf with that character.
Consider how a decode would work: if it starts at the root of the tree, then a zero means follow the left branch, and a one means follow the right branch. When it hits a leaf, it outputs the character stored in that leaf and starts back at the root to find the next character.
If we put frequently used characters near the root and infrequently used characters deeper in the tree, then our encoding will be more efficient than using a fixed number of bits for each character. In the worst case it will be exactly the same.
This encoding scheme relies on the prefix property: no character encoding can be the prefix of any other.
We define two datatypes
- datatype direction = L | R; - datatype huff = C of int * char | B of int * huff * huff;
The int field will hold the number of
    occurrences of a char in the case of
    C, or all chars under the given node
    in the case of B.
The decode1 function takes a Huffman tree and a
    list of directions, and returns the next character found and the
    remaining directions
- exception Bad_input;
> exn Bad_input = Bad_input : exn
- fun decode1 (C (_, c)) rest = (c, rest)
    | decode1 (B (_, l, _)) (L :: rest) = decode1 l rest
    | decode1 (B (_, _, r)) (R :: rest) = decode1 r rest
    | decode1 _ _ = raise Bad_input;
> val decode1 = fn : huff -> direction list ->
    char * direction list
    Using decode1 to decode a full string is
    straightforward. A typical implementation will include a special
    sentinal value in the tree that marks the end-of-input.
Encoding is trickier. Here is a simple first attempt
- fun member (C (_, c)) ch = c = ch
    | member (B (_, l, r)) ch = member l ch orelse member r ch;
> val member = fn : huff -> char -> bool
- fun encode1 (C _) _ = []
    | encode1 (B (_, l, r)) ch =
        if member l ch then L :: encode1 l ch
        else R :: encode1 r ch;
> val encode1 = fn : huff -> char -> direction list
    The problem with this approach is in the repeated calls to
    member. Since member is an
    O(n) operation, in the worst case
    encode1 takes O(n2).
We can improve encoding time to O(n) for
    each character with this version of encode1
- fun encode1 tree ch =
    let fun f (C (_, c)) = if c = ch then [[]] else []
          | f (B (_, l, r)) = map (fn x => L :: x) (f l) @
                              map (fn x => R :: x) (f r)
    in hd (f tree) end;
> val encode1 = fn : huff -> char -> direction list
    Or we can use exceptions to quit searching when we find the path
- exception Success of direction list;
> exn Success = fn : direction list -> exn
- fun encode1 tree ch =
    let fun f (C (_, c)) a = if c <> ch then ()
                             else raise (Success (rev a))
          | f (B (_, l, r)) a = (f l (L :: a);
                                 f r (R :: a))
    in (f tree []; []) handle Success res => res
    end;
> val encode1 = fn : huff -> char -> direction list
A better approach is to gather all encodings and put them in a dictionary, e.g., a red-black tree dictionary. We can gather the encodings by walking over all the nodes in the tree and keeping track of every path we take
- fun encodeGather tree =
    let fun f (C (_, c)) lst a = (c, rev lst) :: a
          | f (B (_, l, r)) lst a =
                f r (R :: lst) (f l (L :: lst) a)
    in f tree [] [] end;
> val encodeGather = fn : huff -> (char * direction list) list
    This function gathers a list of
    (char, encoding) pairs, which
    can easily be folded into a dictionary.
Constructing an optimal Huffman tree is quite simple. Suppose
    we are given a list of characters and their relative frequencies
    as a list of (freq, char) pairs
    in order from least frequent to most.
 First we convert this into a list of Huffman trees using
    map C. Then we repeatedly combine the two
    least frequent Huffman trees until we have just one tree
    left. We start with a few helper functions
- fun freq (C (x, _)) = x
    | freq (B (x, _, _)) = x;
> val freq = fn : huff -> int
- fun insert elt [] = [elt]
    | insert elt (lst as x::xs) =
        if freq elt < freq x then elt :: lst
        else x :: insert elt xs;
> val insert = fn : huff -> huff list -> huff list
Then actually constructing the tree is simple
- fun build [] = raise Bad_input
    | build [x] = x
    | build (x1::x2::xs) =
        build (insert (B (freq x1 + freq x2, x1, x2)) xs);
> val build = fn : huff list -> huff
    build repeatedly combines the two trees with
    lowest frequency (the first two in a sorted list) and puts the
    result back into the list until there is only a single tree
    left.