
Department of Computer Science and Technology

Course pages 2023–24

Geometric Deep Learning

Principal lecturers: Prof Pietro Liò, Dr Petar Veličković
Taken by: MPhil ACS, Part III
Code: L65
Term: Lent
Hours: 16 (12hrs lectures + 4hrs practicals)
Format: In-person lectures
Class limit: max. 50 students
Prerequisites: Experience with machine learning and deep neural networks is recommended. Knowledge of concepts from graph theory and group theory is useful, although the relevant parts will be explicitly retaught.

Aims and Objectives

Most of the patterns we see in nature can be elegantly reasoned about using spatial symmetries: transformations that leave the underlying objects unchanged. This observation has had direct implications for the development of modern deep learning architectures that are seemingly able to escape the "curse of dimensionality" and fit complex high-dimensional tasks from noisy real-world data. Beyond this, one of the most generic symmetries, the permutation, proves remarkably powerful for building models that reason over graph-structured data, an excellent abstraction for naturally occurring, irregularly structured data. Prominent examples include molecules (represented as graphs of atoms and bonds, with three-dimensional coordinates provided), social networks and transportation networks. Application areas already impacted include traffic forecasting, drug discovery, social network analysis and recommender systems. The module will equip students to analyse irregularly and nontrivially structured data effectively, and to situate geometric deep learning in its proper context among related fields. The main aim of the course is to enable students to make direct contributions to the field, thoroughly assimilate its key concepts, and draw relevant connections to various other fields (such as NLP, Fourier Analysis and Probabilistic Graphical Models). We assume only a basic background in machine learning with deep neural networks.

Learning outcomes

  • The framework of geometric deep learning, and its key building blocks: symmetries, representations, invariance and equivariance
  • Fundamentals of processing data on graphs, as well as impactful application areas for graph representation learning
  • Theoretical principles of graph machine learning: permutation invariance and equivariance (see the sketch after this list)
  • The three "flavours" of spatial graph neural networks (GNNs), convolutional, attentional and message passing, and their relative merits; the Transformer architecture as a special case
  • Attaching symmetries to graphs: CNNs on images, spheres and manifolds, Geometric Graphs and E(n)-equivariant GNNs
  • Relevant connections of geometric deep learning to various other fields (such as NLP, Fourier Analysis and Probabilistic Graphical Models)
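
As an illustration of the invariance and equivariance outcomes above, here is a minimal Deep Sets-style sketch (plain NumPy; the shared weight matrix W and the tanh nonlinearity are arbitrary illustrative choices, not a model prescribed by the course) checking numerically that a per-node update is permutation-equivariant while a sum readout is permutation-invariant:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 3))    # 5 nodes, 3 features each
    P = np.eye(5)[rng.permutation(5)]  # a random permutation matrix
    W = rng.standard_normal((3, 4))    # illustrative shared weights

    def node_update(X):
        # Applies the same map to every node independently: equivariant.
        return np.tanh(X @ W)

    def readout(X):
        # Sums the updated nodes: invariant (Deep Sets-style aggregation).
        return node_update(X).sum(axis=0)

    # Equivariance: permuting the nodes permutes the outputs the same way.
    assert np.allclose(node_update(P @ X), P @ node_update(X))
    # Invariance: permuting the nodes leaves the aggregate unchanged.
    assert np.allclose(readout(P @ X), readout(X))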

Lectures

The lectures will cover the following topics:

  • Learning with invariances and symmetries: geometric deep learning. Foundations of group theory and representation theory.
  • Why study data on graphs? Success stories: drug screening, travel time estimation, recommender systems. Fundamentals of graph data processing: network science, spectral clustering, node embeddings.
  • Permutation invariance and equivariance on sets and graphs. The principal tasks of node, edge and graph classification. Neural networks for point clouds: Deep Sets, PointNet; universal approximation properties.
  • The three flavours of spatial GNNs: convolutional, attentional, message passing. Prominent examples: GCN, SGC, ChebyNets, MoNet, GAT, GATv2, IN, MPNN, GraphNets. Tradeoffs of using different GNN variants (a minimal message-passing layer is sketched after this list).
  • Graph rewiring: how to apply GNNs when there is no graph? Links to natural language processing: Transformers as a special case of attentional GNNs. Representative methodologies for graph rewiring: GDC, SDRF, EGP, DGCNN.
  • Expressive power of graph neural networks: the Weisfeiler-Lehman hierarchy. GIN as a maximally expressive GNN. Links between GNNs and graph algorithms: neural algorithmic reasoning.
  • Combining spatial symmetries with GNNs: E(n)-equivariant GNNs, TFNs, SE(3)-Transformer. A deep dive into AlphaFold 2.
  • Worked examples: Circulant matrices on grids, the discrete Fourier transform, and convolutional networks on spheres. Graph Fourier transform and the Laplacian eigenbasis.
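
As referenced in the list above, the following is a minimal sketch of the message-passing flavour (plain NumPy; the weight matrices and tanh nonlinearities are hypothetical stand-ins for the learnable message function psi and update function phi), together with a numerical check of its permutation equivariance:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 4, 3
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [1, 0, 0, 0]])            # adjacency of a small undirected graph
    X = rng.standard_normal((n, d))          # node features
    W_msg = rng.standard_normal((2 * d, d))  # illustrative weights for psi
    W_upd = rng.standard_normal((2 * d, d))  # illustrative weights for phi

    def mpnn_layer(A, X):
        H = np.zeros_like(X)
        for i in range(A.shape[0]):
            nbrs = np.nonzero(A[i])[0]
            # psi: a message from each neighbour j, computed from (x_i, x_j)
            pairs = np.concatenate(
                [np.broadcast_to(X[i], (len(nbrs), X.shape[1])), X[nbrs]], axis=1)
            msgs = np.tanh(pairs @ W_msg)
            # Sum aggregation is permutation-invariant over the neighbourhood.
            agg = msgs.sum(axis=0)
            # phi: update node i from its own features and the aggregate.
            H[i] = np.tanh(np.concatenate([X[i], agg]) @ W_upd)
        return H

    # Permutation equivariance: relabelling the graph relabels the output.
    P = np.eye(n)[rng.permutation(n)]
    assert np.allclose(mpnn_layer(P @ A @ P.T, P @ X), P @ mpnn_layer(A, X))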
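
And a small numerical companion to the worked-examples lecture (again NumPy; the kernel and signal are random stand-ins): every circulant matrix implements a circular convolution, and the discrete Fourier transform diagonalises it, so the matrix product and the pointwise product in Fourier space agree:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    c = rng.standard_normal(n)   # convolution kernel = first column of C
    x = rng.standard_normal(n)   # a signal on a cyclic grid

    # Circulant matrix: column k is the kernel rolled by k positions,
    # i.e. C[i, j] = c[(i - j) mod n].
    C = np.stack([np.roll(c, k) for k in range(n)], axis=1)

    # Convolution theorem: C @ x equals the inverse DFT of the
    # pointwise product of the DFTs of c and x.
    y_matrix = C @ x
    y_fourier = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real
    assert np.allclose(y_matrix, y_fourier)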

Practicals

The practical is designed to complement the lectures and teach students to derive additional important results and architectures not shown directly in them. It will be given as a series of individual exercises (each either a code implementation or a proof/derivation), and each exercise can be individually assessed against a specified mark budget.

Possible practical topics include the study of higher-order GNNs and equivariant message passing.

Assessment

  • (60%) Group mini-project (writeup) at the end of the course. Mini-projects can either be self-proposed, or students can express a preference for one of the provided topics, a list of which will be announced at the start of term. The projects will consist of implementing and/or extending graph representation learning models from the literature and applying them to publicly available datasets. Students will undertake the project in pairs and submit a joint writeup limited to 4,000 words (in line with other modules); an appendix of work logs is to be included but will not be graded.

  • (10%) Short presentation and viva: students will give a short presentation explaining their individual contribution to the mini-project, followed by a short viva.

  • (30%) Practical work completion: completing the exercises specified in the practical to a satisfactory standard. The practical assessor should be satisfied that the student derived their answers using insight gained from the course, coupled with original thought, rather than by simply copy-pasting from relevant related work. Students will submit code and a short report, which will then be marked against the predetermined mark budget for each practical item.

Students will learn how to run advanced architectures on a GPU, but there is no specific need for dedicated GPU resources: the practicals can be completed on a CPU, and, if required, students can use GPUs on freely available services (such as Colab) for their mini-project work.

References

The course will be based on the following literature:

  • "Geometric Deep Learning: Grids, Graphs, Groups, Geodesics, and Gauges", by Michael Bronstein, Joan Bruna, Taco Cohen and Petar Veličković
  • "Graph Representation Learning", by Will Hamilton
  • "Deep Learning", by Ian Goodfellow, Yoshua Bengio and Aaron Courville.