Researcher in Artificial Intelligence
EPSRC Advanced Research Fellowship (2002-2012)
In this project, I aim to model some powerful human-oriented reasoning techniques on machines, in particular reasoning with diagrams and learning from examples. Human mathematicians often informally use diagrams in order to better convey solutions to problems. Yet, diagrams have only seldom been used as formal tools for reasoning - symbolic logic is the predominant method of automated reasoning. I propose to formalise the use of diagrams in flexible architectures which allow users (in the case of interactive systems) and systems (in the case of completely automated systems) to reason both purely diagrammatically, and with a combination of diagrammatic and symbolic inferencing methods. The second informal human reasoning technique I propose to model is the process of learning a general concept from specific examples. The scarce existing research into mechanising reasoning systems that can learn from examples addresses only symbolic reasoning. I propose to combine the use of diagrams and learning from examples by devising a system that can reason with diagrammatic, symbolic, and combined diagrammatic and symbolic methods (so-called heterogeneous methods), and can moreover learn such methods automatically from well-chosen examples. The hope is that ultimately such a system will be able to automatically discover new, interesting and intuitive solutions to problems.
We encounter mathematics in every aspect of our lives. Some of the deepest and greatest insights into reasoning were made in mathematics. Hence, it is not surprising that emulating such powerful reasoning on machines is one of the important and difficult aims of artificial intelligence and automated reasoning. Humans often use diagrams to better convey problems and their solutions, and to generate intuitive and easily understandable solutions. They are also often able to learn general solutions from examples of solutions to related problems. Moreover, when they encounter a new problem, they may reason by analogy or symmetry with an already known solution to a similar problem. My research interests are in the exploration of the nature of such informal reasoning. This type of human reasoning is very powerful, yet its potential has largely not been exploited in the design of mechanised reasoning systems (i.e., systems which use some logic formalism to (semi-)automatically solve problems). There are state-of-the-art systems which can use diagrams, but they do not do diagrammatic reasoning - on the contrary, they search for symbolic proofs [11,1,2]. There are also systems which can learn patterns, but these patterns are expressed as symbolic expressions with simple structures - they cannot learn complex structures such as algorithms or programs [27,31,26,35].
In the past, it has been claimed that informal reasoning is fundamentally human, and that machines are incapable of it (e.g., see Penrose's ``Shadows of the Mind'' [30]). I disagree with this claim - my aim is to formalise and emulate human reasoning with diagrams and human learning on machines, in order both to understand the human reasoning better and to make the machines more capable of solving complex problems of mathematics.
More specifically, the idea for this project is to combine some of the informal aspects of human reasoning in a single framework. My main objective is twofold. First, I aim to build a system which is capable of reasoning with diagrams and with a combination of diagrams and symbolic formulae in the same proof attempt. Proofs constructed from formulae and diagrams are referred to as heterogeneous proofs. Second, this system will be able to learn general proof methods from examples of specific ones. These methods can be either diagrammatic or heterogeneous (notice that I do not specifically aim to learn symbolic proofs, as this has been tackled in my earlier work [21]). Such formalised ``informal'' reasoning can then be used to complement existing symbolic proof methods.
Rather than learning low-level proofs, I aim for a system that can learn diagrammatic proof plans. Proof planning [6] is an approach to theorem proving which uses high-level proof methods rather than low-level logical inference rules to find a proof of a conjecture at hand. It is a technique that searches for a proof plan, and a proof plan consists of some combination of proof methods. A proof method specifies and encodes a general reasoning strategy that can be used in a proof, and typically represents a number of individual inference rules. For instance, mathematical induction can be represented as a proof method that consists of first considering a base case and then a step case of the theorem at hand. Our diagrammatic proof plans will be formed from diagrammatic proof methods, i.e., combinations of various geometric operations on diagrams. The heterogeneous proof plans will be formed from geometric operations plus symbolic inference steps. The system will be able to learn such proof methods from examples of the use of lower-level methods, and eventually, it will be able to learn new diagrammatic and heterogeneous proof plans. The hope is that ultimately learning new, general and complex proof methods and proof plans may lead to the discovery of new and interesting proofs of theorems of mathematics that use diagrams for inferencing.
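To make the proof-planning vocabulary concrete, here is a minimal Python sketch (all names, and the trivial depth-first planner, are hypothetical illustrations rather than the proposed system): a proof method pairs a precondition with the subgoals it produces, and a proof plan is the sequence of method applications that closes all goals.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ProofMethod:
    """A proof method in the proof-planning sense: a named strategy
    with a precondition (is it applicable to this goal?) and an
    effect (which subgoals remain?)."""
    name: str
    precondition: Callable[[int], bool]
    apply: Callable[[int], List[int]]

# Example: a method for the odd-sum theorem of Figure 1, reducing
# the goal "prove the identity for n" to the same goal for n - 1.
lcut_method = ProofMethod(
    name="lcut",
    precondition=lambda n: n >= 1,
    apply=lambda n: [n - 1] if n > 1 else [],
)

def plan(goal, methods):
    """A trivial planner: repeatedly apply the first applicable
    method; the resulting method sequence is the proof plan."""
    steps, agenda = [], [goal]
    while agenda:
        g = agenda.pop()
        for m in methods:
            if m.precondition(g):
                steps.append((m.name, g))
                agenda.extend(m.apply(g))
                break
    return steps

print(plan(3, [lcut_method]))  # -> [('lcut', 3), ('lcut', 2), ('lcut', 1)]
```

A real proof planner would of course search among competing applicable methods and backtrack; this sketch only fixes the data structures the text describes.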
Below are two examples which demonstrate the type of proof methods and proofs that we aim to learn and construct automatically. They also demonstrate the distinction that I make between diagrammatic and symbolic reasoning. A general definition of a diagram or a diagrammatic representation in a computer has been a subject of many debates, but no consensus has been reached [7]. In this project the notion ``diagrammatic'' will be adopted for any machine representation or operation on it that depicts or makes use of some spatial property of the diagram that it represents.
Figure 1 presents an example of a diagrammatic proof [28] of a theorem concerning the sum of odd naturals n^{2}=1+3+5+...+(2n-1).
The proof consists of repeatedly applying lcuts to a square (an lcut removes an ell shape formed from two adjacent sides of a square - see Figure 1). Notice that an ell represents an odd natural number: the two joined sides of a square of magnitude n contain 2n cells, but the joining vertex is counted twice (hence 2n-1). This proof could be represented as one proof method of repeated applications of lcuts, where an lcut could itself be a separate proof method consisting of removing an adjacent row and column from a square.
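The lcut proof method can be sketched in a few lines of Python (a hypothetical illustration: the square is modelled only by its magnitude, not as a full diagram):

```python
def lcut(n):
    """Remove an ell (one row plus one column sharing a corner) from
    an n x n square, leaving an (n-1) x (n-1) square. The ell has
    2n - 1 cells: n + n minus the doubly counted corner."""
    assert n >= 1
    return 2 * n - 1, n - 1  # (cells removed, remaining square side)

def diagrammatic_proof(n):
    """Repeatedly apply lcuts to an n x n square, collecting the odd
    numbers removed; the square is exhausted exactly when the odds
    2n-1, 2n-3, ..., 1 have been peeled off."""
    odds, side = [], n
    while side > 0:
        ell, side = lcut(side)
        odds.append(ell)
    return odds

# Each lcut removes the next odd number, so the trace witnesses
# n^2 = (2n-1) + (2n-3) + ... + 1 for the tested instance.
print(diagrammatic_proof(4))  # -> [7, 5, 3, 1]
assert sum(diagrammatic_proof(4)) == 16
```

This only checks concrete instances; the generality of the diagrammatic proof comes from the schematic argument over the diagram, as discussed in [28].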
Here is another example, this time of a heterogeneous proof that consists of a combination of symbolic and diagrammatic inference steps. The theorem states the arithmetic-geometric mean inequality (a+b)/2 >= sqrt(ab), where a,b >= 0. The first few symbolic steps of the proof square both sides and expand: (a+b)/2 >= sqrt(ab) ==> (a+b)^{2} >= 4ab ==> a^{2}+2ab+b^{2} >= 4ab.
The second part of the proof [28], which is presented in Figure 2, shows diagrammatically the inequality a^{2}+2ab+b^{2} >= 4ab.
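The diagrammatic step can be checked numerically with a small sketch, assuming the standard "proof without words" decomposition (which may differ in detail from Figure 2): a square of side a+b contains four a x b rectangles, with a square of side |a-b| left over.

```python
def heterogeneous_check(a, b):
    """Check the diagrammatic step for concrete a, b >= 0: the big
    square of side a + b has area a^2 + 2ab + b^2; it decomposes into
    four a x b rectangles (area 4ab) plus a central square of side
    |a - b| (this decomposition is an assumption of the sketch)."""
    big_square = a * a + 2 * a * b + b * b
    leftover = (a - b) ** 2
    assert big_square == 4 * a * b + leftover  # the decomposition is exact
    return big_square >= 4 * a * b             # hence the inequality holds

assert all(heterogeneous_check(a, b) for a in range(10) for b in range(10))
```

As with the lcut sketch, this validates instances only; the diagram itself carries the general argument.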
Automated reasoning systems have their roots in the fifties, when the first programs were written that could automatically prove simple theorems of propositional logic. As a result of growing interest in automated reasoning research, today we have many sophisticated systems, such as the theorem prover of Boyer and Moore [4] or Isabelle [29], in which one can prove complex theorems of mathematics.
However, during all these years, perhaps due to the influence of axiomatic logic, the majority of researchers have concentrated their efforts on improving exact, rigorous and formal proof-search algorithms for particular formal systems of logic. In these efforts they have neglected the beauty and power of the informal, intuitive reasoning of human mathematicians. There are exceptions, including work by Gelernter [11] and Bundy [5].
Bundy and his group developed a proof planning approach [6] to proving theorems which models human reasoning by searching for abstract high-level proof plans rather than sometimes unintuitive symbolic logical proofs of theorems. A proof plan is a collection of proof methods, and a proof method is also a collection of other, perhaps lower level, methods with pre- and postconditions. A proof plan looks much like a mathematician's high-level proof, namely it is a specification of a proof. Bundy has argued that in order to progress in computational logic, we need to go further and consider these informal aspects of human reasoning [5].
My work supports this argument. In the past, I have been investigating how informal human reasoning can be modelled on machines. In this project, I aim to develop a system which can reason with diagrams as well as with formulae, and can learn general proofs from examples of specific ones.
Roughly, diagrammatic reasoning systems are computer programs which use a diagram to aid the search for the solution of some problem. The first such program was Gelernter's Geometry Machine [11]. Others share much with Gelernter's Geometry Machine; for example, Grover [1] uses the diagram to prune the search for an essentially symbolic proof. The Diagram Configuration model [22] consists of derived rules about geometrical facts which are used to construct the search space if the problem is related to the diagram used in the rule. These derived rules are similar to our proof methods. In Hyperproof [2] the user can construct proofs by using first-order predicate logic rules and also diagrammatic rules derived from diagram situations in a blocks world. However, unlike in the other systems mentioned so far, inference steps can be made from diagrams to symbolic expressions and vice versa. This is similar to my proposed work, in which I aim to combine symbolic and diagrammatic reasoning steps. Apart from Hyperproof, other systems related to my approach in that they use geometric manipulations of diagrams for inferencing include Archimedes [24] and Bitpict [10]; however, these provide no guarantee that the constructed proofs are correct.
The systems mentioned so far either use diagrams in the search for symbolic proofs, or construct diagrammatic proofs that are not guaranteed to be correct, i.e., they could still be found to be erroneous. At the other extreme of the research in diagrammatic reasoning is the effort to formalise the logic of diagrams [33,12], and the proofs are hence essentially symbolic.
My previous work [14] broke new ground in automated reasoning and diagrammatic reasoning: the system can construct proofs with diagrammatic reasoning steps only, and uses logical machinery to show that the constructed diagrammatic proofs are formally correct.
Hence, the time is ripe to look into combining the existing diagrammatic reasoning methods with the symbolic ones in a single framework which will still guarantee to construct a correct formal proof of a theorem - this is what I aim to do in the proposed project. Furthermore, my objective is also to enable a system to learn from examples of similar reasoning patterns - the existing research on this topic is described next.
Some work has been done in the past on applying machine learning techniques to theorem proving, in particular on improving the proof search [9,32]. However, not much work has concentrated on high level learning of structures of proofs and extending the reasoning primitives within an automated reasoning system.
Silver [34] and Desimone [8] used precondition analysis to learn new method schemas, which are similar to our proof methods. Kolbe, Walter, Melis and Whittle have done related work on the use of analogy [25] and proof reuse [23]. Their systems require a lot of reasoning with one example to reconstruct the features which can then be used to prove a new example. This reconstruction effort needs to be spent on every new example for which the old proof is to be reused.
I have developed [21,20] an approach to learning semi-automatically a symbolically expressed proof method. In this proposed project I will need to investigate whether this learning approach can be applied to learning diagrammatic and heterogeneous proof methods.
To summarise, in my past research I explored reasoning with diagrams [14,13,18], but not by applying proof planning techniques to proof search, and not within an environment that allows for construction of heterogeneous proofs. I also devised a basic framework for learning symbolic proof methods from examples of proofs [21,20], but not in diagrammatic reasoning. These encouraging preliminary results provide me with the right system, framework and techniques to form the basis for the proposed project.
The aims of this proposed project are to:
This project is novel, ambitious and open ended, but with a clear starting point. It addresses issues, and uses techniques and results from several fields of artificial intelligence including automated reasoning, machine learning and cognitive science. At the same time it requires and develops specialised knowledge of diagrammatic reasoning, proof planning, theorem proving and machine learning. In addition to the theoretical results about the nature of human informal reasoning in mathematics, the result of the project will be a semi- or fully-automatic reasoning system which can prove theorems diagrammatically, learn new proof plans, and hopefully discover new diagrammatic proofs of mathematical theorems.
The success of the project will be evaluated by measuring the extent to which the aims of the project listed above have been achieved. These aims will be achieved by realising the plan that I describe next - the research can be divided into five main work-packages. The first three are fairly clearly defined, hence it is possible to estimate the time that they will take to achieve. The last two are more open-ended, and hence the time that individual tasks will take is more difficult to predict - the research will show the more precise direction that I will take.
A starting point for this project is to choose a problem domain with a wealth of theorems that can be represented and proved diagrammatically. For instance, in my past work I considered problems of discrete space (i.e., the natural numbers), like the example in Figure 1. A potential additional domain is problems of continuous space (i.e., the real numbers), like the one in Figure 2. A set of possible diagrams and diagrammatic operations needs to be identified next. Clearly, I will not be able to study all possible types of diagrams, but some candidates are diagrams of discrete space (see Figure 1) and continuous geometric diagrams (see Figure 2). I should define a mapping relation between these diagrams and sentential formulae (e.g., a square of magnitude n can be mapped to n^{2}), and between diagrammatic operations and inference rules (e.g., an lcut can be mapped to an inference rule n^{2} ==> (2n-1)+(n-1)^{2}). The diagrammatic operations need to be represented as proof methods so that proof plans can be formed from them. To begin with, only basic operations are required, such as combining rows and columns. More complex operations will be learnt by the system automatically.
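As a hypothetical illustration of such a mapping relation (the dictionary keys and function names below are my own, not part of any existing system), the denotations of diagrams and the soundness of the lcut inference rule can be recorded and sanity-checked as follows:

```python
# Hypothetical encoding of the mapping relation: diagrams map to
# sentential formulae; diagrammatic operations map to inference rules.
DIAGRAM_TO_FORMULA = {
    "square(n)": "n^2",    # a square of magnitude n denotes n^2
    "ell(n)": "2n - 1",    # an ell of magnitude n denotes 2n - 1
}

def lcut_rule_sound(n):
    """The lcut operation maps to the inference rule
    n^2 ==> (2n - 1) + (n - 1)^2; check it for a concrete n."""
    return n ** 2 == (2 * n - 1) + (n - 1) ** 2

# The rule holds for every tested magnitude.
assert all(lcut_rule_sound(n) for n in range(1, 100))
```

In the proposed system the right-hand sides would be syntax trees manipulated by the prover rather than strings; the sketch only shows the shape of the correspondence.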
The next step in formalising the proposed reasoning framework is to construct examples of diagrammatic proofs from which new proof methods will be learnt. The conjectures can be input either as sententially represented theorems, by the user, or interesting combinations of diagrams can be constructed automatically by the system (see work-package IV). A mapping relation is used to map the sententially represented theorems into diagrammatic representation. Next, the available diagrammatic proof methods are applied in order to construct proof plans. Initially, these can be input by the user. Later, their selection will be guided by the existing proof planning techniques.
Tasks:
Some proofs cannot be expressed entirely diagrammatically, for a number of reasons. For example, it is difficult to express quantification, negation, disjunction, or taking square roots (see the inequality theorem given earlier). Theorem domains whose proofs involve such features need to be identified (e.g., geometry, category theory, algebra). Next, we have to consider how diagrammatic proof steps can be combined with symbolic ones in a uniform proof system. The proposed system needs to guarantee that such heterogeneous proofs are correct. Both diagrammatic and symbolic proof methods should be treated in the same way by the proof planning engine.
Tasks:
Learning new diagrammatic methods is important for the reasoner's ability to prove theorems. For instance, consider again the theorem about the sum of odd naturals in Figure 1. If an lcut was not available, then the system would not be able to prove this theorem diagrammatically. Thus, this proof method needs to be made available to the user or the system which constructs the proof. The aim is that the proposed system will learn such a method automatically. In this example, an lcut proof method could be learnt from examples of proofs that use the two basic operations, namely the cut of a row and a column which are joined to form an ell.
In the development of the learning architecture I need to address the issue of how diagrammatic proof methods can be represented so that the learning is made easier. That is, how much information needs to be abstracted from the method representation so that the learning techniques can be applied, and how this information can be restored. Furthermore, the learning algorithm that will be required in this project needs to meet at least the following requirements: it should be able to deal with different styles of reasoning (e.g., backwards, forwards), and it should learn proofs with gaps, i.e., where parts of the proof are not known (for these parts, blind search is used). The learning approach described here aims to learn new higher-level proof methods on the basis of the already given ones. It cannot learn language extensions, i.e., it can only learn concepts that can be expressed within an existing language (e.g., if only splitting a row but not a column was defined within our diagrammatic language, then an lcut cannot be expressed and hence learnt). I propose to look into how language extensions could be learnt, which is an area that has been very little explored in the past.
The learning techniques that seem promising for use in this project include the algorithm that I developed in my past research, inductive logic programming, explanation based generalisation and learning programs from traces. I need to identify how ideas from these techniques can be used. For example, given a large set of proof traces, we could examine juxtapositions of most frequently used methods and learn how and in which context they are applied.
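As a sketch of the last idea (method names hypothetical; a real learner would also generalise pre- and postconditions rather than merely count), frequently juxtaposed methods in proof traces can be found in a few lines:

```python
from collections import Counter

def frequent_pairs(traces, threshold=2):
    """Given proof traces (lists of method names), count adjacent
    pairs; pairs occurring at least `threshold` times are candidate
    composite methods (e.g. row_cut followed by col_cut -> lcut)."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return [pair for pair, c in counts.items() if c >= threshold]

traces = [
    ["row_cut", "col_cut", "row_cut", "col_cut"],
    ["row_cut", "col_cut"],
]
print(frequent_pairs(traces))  # -> [('row_cut', 'col_cut')]
```

The candidate pair would then be packaged as a new proof method whose precondition is the conjunction of the contexts in which the juxtaposition occurred.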
Tasks:
The result of work-packages I-III will be a system that is able to learn new diagrammatic and heterogeneous proof methods. Using these methods, the intention is to find new and interesting proof plans for existing theorems, like the ones in Figures 1 and 2. The hope is that it may even be able to automatically discover new conjectures that we have not considered before. By its very nature, this part is very open-ended and the least predictable in terms of the direction in which it will lead us.
An obvious first step in the construction of examples of proofs is to apply the available diagrammatic and symbolic proof methods using exhaustive search. Given simple initial diagrams, and a limited repertoire of geometric operations and symbolic proof methods, this may be tractable. Later, this search can be improved with the use of heuristics (e.g., certain known geometric properties indicate how to prune the search). The search can also be guided by the existing proof planning techniques. Furthermore, interesting combinations of diagrams can be constructed automatically by the system which may lead to a discovery of new conjectures. The user can provide some criteria for selecting interesting proofs and conjectures.
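The exhaustive first step can be sketched as follows, assuming a toy repertoire of two operations on rectangular grids (the operation names and state representation are hypothetical): every operation sequence up to a depth bound is enumerated, and the traces that exhaust the diagram are the candidate proofs.

```python
OPS = ["row_cut", "col_cut"]

def apply_op(op, state):
    """States are (rows, cols) grids; each operation removes the
    stated number of cells, or is inapplicable (returns None)."""
    rows, cols = state
    if op == "row_cut" and rows > 0:
        return cols, (rows - 1, cols)   # one row of `cols` cells removed
    if op == "col_cut" and cols > 0:
        return rows, (rows, cols - 1)   # one column of `rows` cells removed
    return None

def exhaustive_search(state, depth, trace=()):
    """Enumerate every operation sequence up to `depth` steps and
    return the traces that exhaust the grid - candidate proofs."""
    rows, cols = state
    if rows == 0 or cols == 0:          # nothing left: a complete trace
        return [trace]
    if depth == 0:
        return []
    proofs = []
    for op in OPS:
        result = apply_op(op, state)
        if result is not None:
            removed, new_state = result
            proofs += exhaustive_search(new_state, depth - 1,
                                        trace + ((op, removed),))
    return proofs

# Two complete dismantlings of a 2 x 2 square within two steps; in
# each trace the removed cells sum to the area, 4.
print(exhaustive_search((2, 2), 2))
```

Heuristics and proof planning would prune this enumeration; the invariant that removed cells sum to the starting area is exactly the kind of pattern from which a conjecture could be proposed.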
Tasks:
I will analyse the characteristics of different types of proofs (i.e., symbolic, diagrammatic and heterogeneous) with respect to, for example, the size of the search space required for each type and the length of proofs of each type. Some cognitive analysis will be carried out in order to compare purely symbolic logical proofs with our diagrammatic or heterogeneous proofs. This will shed light on how humans do proofs.
While the implementation of the proposed system is an important dimension of this project, the cognitive analysis will be a novel input to the development process. I plan to collaborate with cognitive scientists (e.g., Alan Blackwell and his group in Cambridge) and hope to benefit from their expertise when carrying out this analysis.
Tasks:
The demand for automated reasoning techniques that can solve problems of system reliability, information retrieval, natural language understanding, computer-based teaching, etc. is increasing. The proposed research is ambitious and will respond to this demand in a novel and creative way. The drawback of the majority of existing mechanised reasoners (e.g., TPS, OMEGA, lambdaClam) is that they are complex and unintuitive - in order to use them, one needs to be a trained logician. My work will provide techniques which will lead to tools that are more intuitive and easier to use. Constructing diagrammatic and heterogeneous proofs will make the portrayal and the communication of proofs in mechanised reasoning systems better and more accessible. The nature of this project is interdisciplinary, hence various communities may be interested in my work. The automated reasoning and machine discovery communities will be interested in the formalisation of ``informal'' reasoning techniques that may be able to discover new knowledge. The machine learning community may be interested in my application of its techniques to automated reasoning. More intuitive proof systems and newly discovered conjectures and proofs will interest mathematicians. Philosophers and cognitive scientists will be interested in the cognitive issues related to the notion of proof and models of human reasoning. Automated proof systems are also becoming increasingly present in industry, which may benefit from the use of heterogeneous techniques to solve problems in a more intuitive and easily understandable way. An important application of this work is in mathematics education, which could be improved by applying my new techniques to the design of tools that enable mathematics educationalists to teach how to find more intuitive solutions to problems.
I will inform various communities mentioned above about my work through different channels. I will publish my results in the major journals (e.g., J. of Automated Reasoning, J. of Artificial Intelligence, J. of Logic and Computation) and conferences (e.g., IJCAI, CADE, IJCAR, ECAI, AAAI). Where possible, I will make my software publicly available over the web. The fellowship will also enable me to travel widely to British and other universities abroad. My existing close collaborations with the automated reasoning communities in Birmingham, Edinburgh, Cambridge, Saarbrücken (Germany), Trento (Italy) and the partners of the European Calculemus Network (of which I am a member as well) will be maintained through extended research visits. New collaborations, particularly with the cognitive science and human-computer interaction communities will be developed. Through visiting other universities and participating in summer schools and workshops, I plan to give several talks and courses (e.g., as I did for ESSLLI summer school 2000) about my work.
The first three work-packages are fairly clearly defined, so the tasks listed in them will be achieved in the first three years or so. The last two are less predictable, since the research is novel and hence by its nature carries some degree of uncertainty as to which direction I will take and how long it will take me to achieve the listed tasks. Each task represents a milestone for the proposed project.