Algorithms for finding structures within graphs

This code accompanies the Algorithms lecture course for computer science at Cambridge University, taught by Damon Wischik.

Contents

Max-flow using linear algebra

The Ford-Fulkerson algorithm, for computing a maximum flow on a network, is coursework, so you won't find it here! Instead, here is another algorithm for finding a maximum flow, based on linear algebra.

We write out the max-flow problem in linear algebra as follows. Let $E$ be the set of edges, and let $\mathbf{c}\in\mathbb{R}^E$ and $\mathbf{f}\in\mathbb{R}^E$ be the capacities of each edge, and the flow on each edge, expressed as vectors.

Flow constraints

Flow conservation says that, at every vertex $v$ except for the source and the sink, $$ \sum_e f_e 1_{\operatorname{end}(e)=v} = \sum_e f_e 1_{\operatorname{start}(e)=v} $$ In words, the sum of flows on all edges wthat end at $v$ is equal to the sum of flows on all edges that start at $v$. The notation $1_X$ is called the indicator function, also known as one-hot coding; it is equal to $1$ when the condition $X$ is true, and $0$ when $X$ is false. We can rewrite this in matrix notation as $$ \sum_e A_{v e} f_e=0 \qquad \text{where} \qquad A_{v e}=1_{\operatorname{start}(e)=v} - 1_{\operatorname{end}(e)=v} $$ or, even more succinctly, $$ A \mathbf{f} = 0 $$ where $A$ is a matrix with $V-2$ columns and $E$ rows.

There is another constraint on $\mathbf{f}$. For it to be a valid flow, it also has to satisfy the capacity constraints: it can't be negative, and it has to fit within capacity. In vector notation, $\mathbf{f}\geq\mathbf{0}$ and $\mathbf{f}\leq\mathbf{c}$.

Flow value

The value of $\mathbf{f}$ is the net flow out of the source, in other words $$ \operatorname{value}(\mathbf{f}) = \sum_e f_e b_e \qquad\text{where}\qquad b_e=1_{\operatorname{start}(e)=s} - 1_{\operatorname{end}(e)=s} $$ Or, in vector notation, $$ \operatorname{value}(\mathbf{f}) = \mathbf{b}^\mathsf{T} \mathbf{f} $$

Max flow

In summary, the max-flow problem can be written as $$ \begin{align} &\text{maximize} && \mathbf{b}^\mathsf{T}\mathbf{f}\\ &\text{over} && \mathbf{f}\in\mathbb{R}^E,\; \mathbf{f}\geq \mathbf{0},\; \mathbf{f}\leq \mathbf{c}\\ &\text{such that} && A\mathbf{f} = \mathbf{0}. \end{align} $$ This is an example of a linear programming problem, for which there are standard library solvers. All we need to do is set up the vectors and matrices, then call the library routine. The standard solvers do not respect the Integrality Lemma, i.e. there is no guarantee that we'll end up with an integer flow, even if all capacities are integer.

Force-weighted graph layout

This graph is from Algorithms section 4.3 by Sedgewick and Wayne. They plot the graph and show a minimum spanning tree.

Prim's algorithm

Kruskal's algorithm

Toposort