ALEXANDRIA: Large-Scale Formal Proof for the Working Mathematician

L. C. Paulson, Computer Laboratory, University of Cambridge

And the postdocs: Anthony Bordg, Angeliki Koutsoukou-Argyraki, Wenda Li, Yiannos Stathopoulos, and associated team members: Dr Sean Holden, Chelsea Edmonds, Albert (Qiaochu) Jiang, Chaitanya Mangla.

From the proposal

Mathematical proofs have always been prone to error. Today, proofs can be hundreds of pages long and combine results from many specialisms, making them almost impossible to check. One solution is to deploy modern verification technology. Interactive theorem provers have demonstrated their potential as vehicles for formalising mathematics through achievements such as the verification of the Kepler Conjecture. Proofs done using such tools reach a high standard of correctness.

However, existing theorem provers are unsuitable for mathematics. Their formal proofs are unreadable. They struggle to do simple tasks, such as evaluating limits. They lack much basic mathematics, and the material they do have is difficult to locate and apply.

ALEXANDRIA will create a proof development environment attractive to working mathematicians, utilising the best technology available across computer science. Its focus will be the management and use of large-scale mathematical knowledge, both theorems and algorithms. The project will employ mathematicians to investigate the formalisation of mathematics in practice. Our already substantial formalised libraries will serve as the starting point. They will be extended and annotated to support sophisticated searches. Techniques will be borrowed from machine learning, information retrieval and natural language processing. Algorithms will be treated similarly: ALEXANDRIA will help users find and invoke the proof methods and algorithms appropriate for the task.

ALEXANDRIA will provide (1) comprehensive formal mathematical libraries; (2) search within libraries, and the mining of libraries for proof patterns; (3) automated support for the construction of large formal proofs; (4) sound and practical computer algebra tools.

ALEXANDRIA will be based on legible structured proofs. Formal proofs should be not mere code, but a machine-checkable form of communication between mathematicians.

So what have we done?

I've written a survey of our formalisation work. Angeliki's Topos Institute talk "The Era of Formalised Mathematics" outlines the historical background as well as the project's achievements. The ERC website has the full list of publications. Some highlights:

The SErAPIS search engine, which uses a combination of formal syntax and mathematical terminology to search the 700+ entries and nearly 4 million lines of proofs in the Isabelle libraries
The first machine proof that every field admits an algebraically closed extension (Paulo Emilio de Vilhena)
The first machine formalisation of Green's theorem (Mohammad Abdulaziz)
Nine chapters of analytic number theory (Manuel Eberl)
Publications on counting real and complex roots of polynomials and evaluating winding numbers. A formal proof of the theorem of three circles (Fox Thompson). All with applications to computer algebra.
Our formalisation of quantum computation revealed a longstanding error in a highly cited journal article on quantum computing. (Hanna Lachnitt and Yijun He)
We took over half of a special issue of Experimental Mathematics with our work on
- Ordinal partitions (with Mirna Džamonja). Also a talk
- Irrationality and transcendence criteria
- Grothendieck's schemes (and we don't need dependent types!)
A follow-up paper on ordinal partitions in memory of Kenneth Kunen for the Annals of Pure and Applied Logic.
A paper on our formalisation of Szemerédi's regularity lemma and Roth's theorem on arithmetic progressions has been published by the Journal of Automated Reasoning.
Chelsea Edmonds' formalisation of Lucas's theorem with accompanying workshop paper and slides (this was her contribution to the effort to formalise the DPRM Theorem by Bayer et al.)
Her PhD work on formalising combinatorial design theory
Her formalisation of Fisher's inequality, with accompanying conference paper and slides
Her formalisation of the Lovász Local Lemma and probabilistic proof methods, with accompanying conference paper and slides
Additional formal developments, including Projective geometry • Octonions • Irrational numbers • Amicable numbers • Plünnecke-Ruzsa inequality • Undirected graph theory • Turán's graph theorem • Kneser's theorem and the Cauchy-Davenport theorem • Khovanskii's theorem
A formalisation of Wetzel's problem, with accompanying conference paper
We formalised about half of a number theory graduate textbook, Modular Functions and Dirichlet Theory (slides and 20-minute video)
The Balog-Szemerédi-Gowers theorem, with the slides and paper for CPP 2023
Anthony Bordg on how to do maths without dependent types: slides and his CPP 2023 paper.
Two more talks given by Anthony on autoformalisation: our work on a parallel corpus and on machine learning
From our extended team, lots of other work in progress on machine learning. A foretaste here. The paper Draft, sketch, and prove: guiding formal theorem provers with informal proofs was presented at ICLR 2023. See also the posters on Thor and Draft/Sketch/Prove
AFP entries by team member: Bordg • Edmonds • Koutsoukou-Argyraki • Li • Paulson
Much more in the pipeline!

We have formalised a wide selection of results across the mathematical landscape: combinatorics, analysis, number theory, Ramsey theory. We've formalised the work of two Fields medalists (Roth and Gowers) an Abel prize winner (Szemerédi) and the legendary Paul Erdős too. For background, commentary, examples, techniques and philosophical issues, see my Machine Logic blog!

The project ran for 72 months starting 1 September 2017.

Last revised: Wednesday, 7 February 2024

Lawrence C. Paulson • Computer Laboratory • University of Cambridge