Automation for Interactive Proof

EPSRC logo

Funded by the EPSRC (January 2004 to January 2007). Grant reference GR/S57198/01. Value £249,905

This project delivered the popular Sledgehammer tool, linking Isabelle with automatic theorem provers and allowing users to make progress with a single click.

Project Members

Background to the Research

Interactive proof tools support rich, expressive formalisms. The user can express a complex model of, say, a CPU design, and develop its formal theory. Unfortunately, interactive proof construction is tedious. At the opposite extreme, resolution systems are automatic and powerful. However, they are restricted to first-order logic with equality. Can we combine the advantages of these two technologies?

This proposal aims to combine interactive proof tools with a leading resolution theorem prover. It also abandons the traditional mode of interaction, where the proof tool does nothing until the user types a command. Background processes should try to prove the outstanding subgoals. Better automation could make formal verification affordable.

Vampire is one of the world's leading resolution theorem provers. Isabelle is a popular interactive proof tool that supports several formalisms, including higher-order logic and ZF set theory.

The Project Objectives are

  1. to give interactive proof tools greatly improved automation
  2. to develop the concept of an interactive proof tool using background processing
  3. to explore the formal relationships between first- and higher-order logic, from the perspective of mechanical theorem proving


The workplan is divided into a series of tasks:

  1. First experiments. The objective is to make a resolution prover do proofs previously performed interactively. The procedure is to take existing Isabelle proofs and attempt to reproduce them using Vampire.
  2. First experiments (equality). The objective is to exploit a resolution prover's treatment of equality. The procedure is to identify Isabelle proofs that require a combination of equality and classical reasoning, attempting to reproduce them using Vampire.
  3. Basic functionality (untyped). The objective is to implement a tactic that transfers a subgoal and its context from a proof assistant to a resolution prover and reports the outcome. The procedure is to write code to automate the steps performed manually in the previous two tasks. We shall use Isabelle/ZF, which implements set theory, in order to avoid complications involving types.
  4. Basic functionality (typed). This extends the previous task to handle types. The objective is to obtain sound reasoning in a simply typed formalism using an untyped resolution prover. We shall use Isabelle/HOL, which implements higher-order logic.
  5. Full modelling of Isabelle types. This extends the previous task to handle order-sorted polymorphism. The procedure is to encode the relevant concepts, such as type classes, in first-order logic. (Order-sorted polymorphism originated with the programming language Haskell.)
  6. New reductions to first-order logic. The objective is to increase the scope of resolution by introducing translations from higher-order logic into first-order logic. Most proof tools allow variable binding within terms, but resolution provers do not.
  7. Transparency. The objective is to minimize the requirement for the user to trust the resolution prover. Ideally, this requires automatically translating the resolution proof into an Isabelle one. A preliminary, minimal objective would be for Vampire to return the precise list of lemmas used.
  8. Transferring the technology. The objective is to show that the techniques developed above are general, and not restricted to specific tools. We intend to link the HOL system to Vampire and perhaps to other resolution provers.
  9. Performance refinements. Even when all functionality has been achieved, tuning will be needed in order to obtain high performance.

Lawrence C. PaulsonComputer LaboratoryUniversity of Cambridge