WORKSHOP ON INTERACTIVE THEOREM PROVING (ITP)
Organised in Cambridge by Mike Gordon and Tobias Nipkow, 24-25 Aug., 2009
Post comments by sending email to: itpworkshop@googlegroups.com.
Subscribe to the group and then upload interesting papers.
Summary
It is hoped to produce a white paper reflecting the results of the
workshop (e.g. future needs, ideas for actions, collaborations). The
material below, which is expected to evolve, is derived from detailed
notes taken at the workshop by Matt Kaufmann, together with comments
that have been received. It is possible that some
participants may feel that topics have been incorrectly emphasised,
mentioned material is of low interest and important ideas have
been omitted. Please send Mike Gordon input so that further versions
can better reflect your views.
There is a striking agreement by almost everyone that interactive
theorem provers should become integrated design, execution and
verification platforms, with links to external tools. All the major
ITP systems (ACL2, Coq, Isabelle, HOL, PVS) already support this and
future plans emphasise the need for more research towards improving
the integration. External facilities that can (and are) being linked
include SAT and SMT solvers, proof search engines, model checkers,
tools for numerical methods and computer algebra systems. There is
also agreement that, if possible, external tools should provide proof
certificates that can be replayed or otherwise checked by the
importing ITP system.
It is especially remarkable that the major existing tools share a
common vision, because they are based on widely different software
architectures (e.g. extensible small kernel versus large pre-built tool
suite) and logical foundations (classical set theory versus
constructive type theory). Furthermore, they were developed within
largely disjoint technical cultures.
Embedded computation is not only useful for executing formal
specifications; it is required for proofs needing lemmas proved by
calculation (e.g. in the Four-Colour proof). A key component is having
a programming language as part of the term structure of the logic
supported by the theorem prover. Besides being used for computations
within proofs, executable terms are also used for checking examples
and counterexamples and for producing certified artifacts
(e.g. trustworthy data). All major current ITP tools support
computation to some extent, though they have quite different methods
of executing programs and moving between computation and deduction.
ITP systems have a diverse variety of uses that range from computer
system verification to artificial intelligence. Examples include:
assisting the creation and checking of mathematical proofs, design,
verification and optimisation of hardware and software, formal
synthesis of implementations from specifications, checking properties,
finding errors and counterexamples, planning and constraint solving.
Although full proof of correctness is often seen as the ultimate goal
of ITP applied to verification, simpler specification consistency
checking can have great value (e.g. type checking, shape analysis,
model checking, termination analysis etc). Experiments in seamlessly
linking these to theorem proving have been done, but further work
is needed, especially large case studies.
There are a number of new ways of interacting with theorem provers in
progress or being planned. These derive both from the way computing is
evolving (e.g. availability of efficient parallel execution thanks to
multi-core processors) and from the growing size of applications,
particularly checking huge mathematics proofs. However, the research
challenges of improving interaction may be tool and application
specific and may be orthogonal to ITP and logical issues. There is
some support for the idea of adopting modern software engineering
methods for managing large proofs, both system architecture ideas and
existing software development platforms (e.g. IDEs like Visual Studio,
Eclipse, NetBeans).
It is not clear, however, if off-the-shelf software development
interaction paradigms can be adapted for proof
development. Interactive proof may need new kinds of interaction -
some interesting ideas are already emerging. Hierarchical, graphical
tools can help with organising, understanding and constructing large
proofs. The Emacs-based Proof General front end is an existence proof
that tool- and application-independent ITP interfaces can be
successful, though opinions on the value and long-term prospects for
Emacs are divided.
Theorem proving has now reached a stage where it is possible to
perform (or attempt to perform) interactive proofs of very complex
existing mathematical results such as Goedel's Theorems, the
Four-Colour Theorem, the Kepler Conjecture (the Flyspeck project) and
the classification of finite simple groups. This application area is
putting demands on ITP tools to become more friendly to
mathematicians. Although this work is "blue sky" research, it may open
up new verification possibilities in software areas underpinned by
sophisticated mathematical theories (e.g. cryptography). Challenge
goals are: (i) having all of college mathematics formalised and
available off-the-shelf for use by theorem provers (an estimate of 140
person years was given); (ii) automating high school mathematics using
a combination of artificial intelligence and decision procedures. If
progress is made on these challenges there may be applications to
education (both generating course material and providing better
computer support for teaching and grading students).
The interaction needed to find proofs is seen as a positive creative
process and not as a necessary evil. Support for prototyping proofs,
like automatic generation of finite counter-examples and complex type
checking, are already useful. Exploring the duality between proving
and refuting may be a good route to getting more effective proof
finding interaction with tools. Currently only finite counter-examples
are supported, but there are new ideas emerging that might lead to
ways for automatically generating symbolic counterexamples in areas
such as continuous mathematics. Methods for visualising
counterexamples would also be useful. Since most programs and initial
conjectures contain errors, there was a general consensus that
counter-example finding deserves a higher priority than it has
traditionally enjoyed.
There is the possibility that ITP tools could aid in the creation of
new mathematics rather than just check old material. Perhaps some
kinds of future mathematics may only be possible with computer support
(by analogy with how computer algebra systems enable previously
intractable physics and engineering symbolic calculations). However,
the current logics supported by existing ITP tools are recognised as
not being attractive to mathematicians, so it will be hard to recruit
people from the mathematical community to use them. The pseudo-natural
language of Mizar and the outsourcing of Flyspeck to cheap smart
Vietnamese mathematicians are successful, but very different,
methodologies for engaging mathematicians. Neither of these approaches
are likely to work on a larger scale, especially as the wider
mathematical community is thought to be both sceptical and poorly
motivated towards computer tools.
For ITP to support large formal specification and verification
applications, real world specifications of, for example, accurate
semantics of widely used programming languages, networking protocols
or modern processor architectures need to be available. This is a
major challenge. Formal specifications are developed in a variety of
formalisms and currently it is hard to port these between systems, so
one can become "locked in" to a particular tool. If one has a project
needing to reuse specifications in the formalisms of different tools,
then complete formality of translation may be impractical and various
semi-formal pragmatic compromises may be necessary.
Current standards tend to be informal and (sometimes deliberately and
beneficially) ambiguous. Standards based on formal specification
methods could enhance the accuracy of models used for verification,
and benefits to the wider assurance certification community can be
envisioned, e.g. the possibility of reference specifications and
implementations being verified compliant to standards. A stepping
stone to standards supported by formal specifications would be a
better understanding of the semantical relationships between the
specification formalisms of different tools.
Semantically justified formal translations between theories may be too
demanding, but even tools just for translating definitions
(specifications) could be a useful. Translations would be aided by
recognised standard formats for different kinds of
specifications. Proposals were made for: (i) extending the SMT-LIB
format towards the kind of logics used for ITP, (ii) interchanging
theories between different ITP logics, and (iii) exchanging judgements
between ITP and automatic tools via a "toolbus". Such standard formats
would enable common repositories for specification, reference
implementations etc. to be better supported. A major challenge would
be to have a stardard representation for general mathematics. This is
much discussed, but there are no concrete proposals accepted within
the ITP community. Having reference examples and demonstrator case
studies in a standard format and in the public domain would provide a
poweful resource for students, researchers and evaluators.
The Internet and "cloud computing" provide new distributed
possibilities for creating mechanised theories. An inspiration is the
success of experiments in the mathematics community of collaborative
problem solving (e.g. Polymath). Could
projects be run like Polymath to accelerate the development of formal
theory infrastructure and also to stress existing tools? A clear
challenge is how to fit together contributions built using different
theorem proving software. Perhaps this could be seen as an add-on to
QPQ or linked to
the vdash concept.
Although the source code of all the major ITP tools is in the public
domain, the degree to which the software is a traditional 'open
source' development varies. More collaboration between different ITP
communities is advocated as a way to boost progress, and optimistic
signs are emerging.
Morning of Monday 24 August
Freek Wiedijk Three wishes
Alan Bundy A Hiproof Interface for Viewing and Constructing Proofs
Joe Hurd Theory Engineering: Proving in the Large
Andy Gordon Some Challenges for Future ITP
Afternoon of Monday 24 August
Mechanising mathematics: scientific challenge and potential applications
Laurent Théry Formalising Mathematics
Cameron Freer Mechanised Mathematics: Four Provocations
Rob Arthan
John Harrison Grumpy Old Man
Morning of Tuesday 25 August
Search for bugs and search for proofs (discussion lead by Tony Hoare)
Daniel Kroening SMT-LIB for HOL
Interfacing ITP to other tools and the real world
Daniel Kroening Interfacing ITP to the Real World
Peter Sewell Interfacing ITP to other tools and the real world
Tobias Nipkow
Shankar Interaction and Automation
Afternoon of Tuesday 25 August
Konrad Slind ITP Uses and Challenges at Rockwell Collins
Future dreams for current tools:
J Moore, Matt Kaufmann ACL2
Bruno Barras Coq
John Harrison HOL Light
Makarius Wenzel Isabelle
Shankar PVS
Mark Adams (mark AT proof-technologies.com)
Mihhail Aizatulin (avatar AT hot.ee)
Rob Arthan (rda AT lemma-one.com)
Bruno Barras (bruno.barras AT inria.fr)
Nick Benton (nick AT microsoft.com)
Alan Bundy (bundy AT staffmail.ed.ac.uk)
Cameron Freer (freer AT math.mit.edu)
Mohan Ganesalingam (mg262 AT cl.cam.ac.uk)
Georges Gonthier (gonthier AT microsoft.com)
Andy Gordon (adg AT microsoft.com)
Mike Gordon (Mike.Gordon AT cl.cam.ac.uk)
David Greaves (David.Greaves AT cl.cam.ac.uk)
John Harrison (johnh AT ichips.intel.com)
Tony Hoare (thoare AT microsoft.com)
Peter Homeier (palantir AT trustworthytools.com)
Joe Hurd (joe AT galois.com)
Paul Jackson (Paul.Jackson AT ed.ac.uk)
Cliff Jones (cliff.jones AT ncl.ac.uk)
Matt Kaufmann (kaufmann AT cs.utexas.edu)
Andrew Kennedy (akenn AT microsoft.com)
Daniel Kroening (kroening AT comlab.ox.ac.uk)
J Strother Moore (Moore AT cs.utexas.edu)
Magnus Myreen (Magnus.Myreen AT cl.cam.ac.uk)
Tobias Nipkow (nipkow AT in.tum.de)
Scott Owens (Scott.Owens AT cl.cam.ac.uk)
Matthew Parkinson (Matthew.Parkinson AT cl.cam.ac.uk)
Larry Paulson (Larry.Paulson AT cl.cam.ac.uk)
Andy Pitts (Andrew.Pitts AT cl.cam.ac.uk)
Claudio Russo (crusso AT microsoft.com)
Peter Sewell (Peter.Sewell AT cl.cam.ac.uk)
Konrad Slind (klslind AT rockwellcollins.com)
Natarajan Shankar (shankar AT csl.sri.com)
Georg Struth (g.struth AT dcs.shef.ac.uk)
Laurent Théry (Laurent.Thery AT sofia.inria.fr)
Thomas Tuerk (Thomas.Tuerk AT cl.cam.ac.uk)
Viktor Vafeiadis (viktorva AT mic rosoft.com)
Tjark Weber (tw333 AT cam.ac.uk)
Makarius Wenzel (wenzelm AT in.tum.de)
Freek Wiedijk (freek AT cs.ru.nl)
Please email Mike Gordon if there are any errors, or if you attended, but are not listed above and would like your name added.
Maintained by Mike Gordon.
This page last updated on September 15, 2009.