Notes on linking HOL and ACL2


Contents Comments and questions are in red and highlighted like this.

 

Goals

The plan is to provide a semantically coherent link between the higher order logic supported by HOL4 and the logic and programming language supported by ACL2.

ACL2 has a number of features that make it highly suitable for system modelling. Our goal is to enable HOL users to employ ACL2 as a trusted oracle. A specific case study we have in mind is to experiment with ACL2 for executing the verified ARM model due to Anthony Fox.

Background

In 1999 Mark Staples implemented a tool for called ACL2PII for linking ACL2 and HOL. This was built with an early version of the Prosper Toolkit. It was quite successful, and was used by Susanto as part of his AMBA-ARM7 Formal Verification Platform.

ACL2PII is our starting point. However, the Prosper Toolkit, though influential and successful in its time, is no longer supported and HOL methodology has moved on. In particular, Konrad Slind has been developing a polytypism methodology for systematically processing and translating higher order logic terms. One (of several) applications is the translation from higher order logic to ML programs, so that ML can be used as an execution oracle for ground HOL terms.

One of the tricky things about translating from the untyped ACL2 logic to typed higher order logic is figuring out which types to assign. An example Staples gives is that, depending on the context, the ACL2 S-expression NIL might need to be translated to F (falsity of HOL type bool), or [] (the empty list) or NONE (the empty option). The ACL2PII user has to set up "translation specifications" that are pattern-matching rewrite rules to perform the ACL2 to HOL translation. These are encoded in ML and are thus outside the fully-expansive LCF methodology (and so may introduce inconsistency).

Inspired by Konrad's polytypism methodology, it is hoped to use a deeper linking, with families of semantic functions defined in higher order logic to perform the coding/decoding bewtween higher order logic and the ACL2 S-expressions. These Slind-style translation functions will be based on a semantics of the ACL2 logic in HOL (deep embedding) and should enable some formal assurance that the translations between the logics is sound. In Appendix A of Linking ACL2 and HOL, Staples gives the following definition of ACL2 S-expressions in HOL:

val packagename = Type`:string`;
val name        = Type`:string`;

Hol_datatype
 `sexp = ACL2_SYMBOL    of ^packagename => ^name
       | ACL2_STRING    of string
       | ACL2_CHARACTER of ascii  [should this be the type char now?]
       | ACL2_NUMBER    of num
       | ACL2_PAIR      of sexp => sexp`;
Matt Kaufman tells me that numbers in ACL2 are complex rationals (complex numbers with rational realpart and imagpart) and characters (i.e. ACL2_CHARACTER) lie in the range [0,256). Thus the type above this isn't quite right.

The next step is to define an EVAL function that gives the meaning of the ACL2 logic (as evaluation may not terminate, this may need to be represented as an inductively defined relation). We then want to set up type-indexed families of translation functions from HOL values into values of type sexp, and to somehow have a mechanism for generating -- maybe "on the fly" -- theorems relating EVAL to the semantics of the encoded values. Exactly how to do this is very fuzzy to me, especially how much can be done inside the logic and what has to be scripted in the metalanguage ML. (Konrad, what do you think? I suspect you've "been there, done that"!)

There is one piece of background that I have that may, or may not, turn out to be useful. This is the (somewhat unsuccessful) work I did to translate between untyped ZF set theory and higher order logic. The approach I used was, like Staples', quite "shallow" (i.e. performed by metalanguage scripts rather than by proof). This endeavour floundered because the logical foundation of the translation scripts wandered into parts of set theory where I was out of my depth. I came away from the exercise feeling that it needed to be done more inside higher order logic, so as to ensure the various logical transformations were formally validated by HOL. This colours my reading of Staples' more successful ACL2PII work, and is one reason why I want to try a "deeper embedding" approach.

Several people have looked into linking classical and constructive higher order logics. For example, Encoding the HOL Light logic in Coq - Freek Wiedijk, Importing Isabelle Formal Mathematics into NuPRL - Naumov (1999) and Semantic foundations for embedding HOL in Nuprl - Howe - 1996. These and similar papers tend to have scary math, but there might be some insights lurking in them.


Email input

Below are various useful emails (thanks guys!) which I'm stashing here (after minor editing to remove non-relevant content) so that I don't lose track of the stuff. Hope eventually to fold the information in the emails into the evolving project description above (when I do that I'll probably delete the emails).

Matt Kaufmann

Date: Tue, 27 Jul 2004 08:03:33 -0500 Message-Id: <200407271303.i6RD3X1S011976@craigievar.cs.utexas.edu> From: Matt Kaufmann To: Mike.Gordon@cl.cam.ac.uk CC: matthews@galois.com, Mike.Gordon@cl.cam.ac.uk, slind@cs.utah.edu Subject: Re: HOL/ACL2: some very first thoughts Hi, Mike -- [stuff deleted] Regarding: Hol_datatype `sexp = ACL2_SYMBOL of ^packagename => ^name | ACL2_STRING of string | ACL2_CHARACTER of ascii [should this be the type char now?] | ACL2_NUMBER of num | ACL2_PAIR of sexp => sexp`; As Staples points out, numbers in ACL2 are rationals, so this isn't quite right. Actually, numbers in ACL2 are complex rationals (complex numbers with rational realpart and imagpart). As for ACL2_CHARACTER, I couldn't remember whether there are only 256 characters! Presumably http://www.cs.utexas.edu/users/moore/publications/km97a.ps.gz answers this, but anyhow I checked the source code and found that indeed, we have functions that establish a bijection between the characters and the integers in [0,256): (defaxiom char-code-linear ; The other properties that we might be tempted to state here, ; (integerp (char-code x)) and (<= 0 (char-code x)), are taken care of by ; type-set-char-code. (< (char-code x) 256) :rule-classes :linear) (defaxiom code-char-type (characterp (code-char n)) :rule-classes :type-prescription) (defaxiom code-char-char-code-is-identity (implies (force (characterp c)) (equal (code-char (char-code c)) c))) (defaxiom char-code-code-char-is-identity (implies (and (force (integerp n)) (force (<= 0 n)) (force (< n 256))) (equal (char-code (code-char n)) n))) Finally, You mention: "theorems relating EVAL to the semantics of the encoded values". I expected that EVAL _defines_ the semantics. Perhaps I misunderstood the point here, which I thought would be that you would want to want something like: EVAL(acl2-term) = corresponding-HOL-term, where I'm over-simplifying by omitted an "assignment" argument of EVAL. -- Matt Date: Tue, 27 Jul 2004 09:52:42 -0500 Message-Id: <200407271452.i6REqgUA024017@sundance.cs.utexas.edu> From: Matt Kaufmann To: Mike.Gordon@cl.cam.ac.uk CC: matthews@galois.com, moore@cs.utexas.edu, Mike.Gordon@cl.cam.ac.uk, slind@cs.utah.edu Subject: Re: HOL/ACL2: some very first thoughts Hi, Mike -- By the way, regarding (again) "theorems relating EVAL to the semantics of the encoded values": I probably misunderstood. Now I'm guessing that your theorems related the application of EVAL on ACL2 terms to corresponding HOL terms, something like this for example: well_formed_acl2_assignment(s) => EVAL('+ x y')s = s(x)+s(y). Right?

J Strother Moore

Date: Tue, 27 Jul 2004 08:45:10 -0500 From: J Strother Moore To: kaufmann@cs.utexas.edu CC: Mike.Gordon@cl.cam.ac.uk Subject: Re: Mission Accomplished Hi Mike. That's an interesting project and I wish I were there to help you. It sounds fun. Feel free to ask us questions. I would definitely start with IF, EQUAL, CAR, CDR, CONS, and CONSP, the constants T and NIL, and recursive definition. You might want some other datatypes and besides the numbers (you might restrict your attention to the integers) I'd suggest the symbols. (INTEGERP x) recognizes ... -2, -1, 0, 1, 2, ..., in the sense that it returns T or NIL accordingly. (SYMBOLP x) recognizes symbols, including T and NIL. I'd formalize addition, for your purposes, as just (BINARY-+ i j) = (hol's-sum (ifix i) (ifix j)) where ifix is (ifix x) = (if (integerp i) i 0) In ACL2, BINARY-+ also adds rationals and complex numbers but defaults all non-numbers to 0. So the formalization above is wrong if non-integer numbers exist, but it's a good start. While there are lots of operations on symbols (e.g., what package is it in? how do you spell its print name?) I wouldn't formalize any. I would urge you just to have some symbols in the model so you can deal with common objects that arise in ACL2 proofs, like '(LOAD A B) is a list of three symbols that might be given semantics by an operational model of an assembly code. All that is important about LOAD, A and B are that they are distinct (and distinct from all lists and all numbers too). J

John Matthews

Cc: Matt Kaufmann , slind@cs.utah.edu From: John Matthews Subject: Re: HOL/ACL2: some very first thoughts Date: Tue, 27 Jul 2004 07:12:53 -0700 To: Mike Gordon X-Mailer: Apple Mail (2.618) X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ X-Cam-AntiVirus: No virus found X-Cam-SpamDetails: scanned, SpamAssassin (score=0.1, RCVD_IN_SORBS 0.10) Hi Mike, [stuff deleted] In scanning through Matt's reply, one thought did occur to me: We don't need to model all of ACL2, at least initially. My understanding is that the axioms underlying ACL2's universe of values allows for new classes of values to be added to the universe later on, i.e. they have an "open-world" assumption. So for example one could start by just modeling the integers, along with an INTEGERP predicate, and leave rationals, complex rationals, etc. unspecified. As time went on these other value classes could then be added consistently. Cheers, -john Return-path: To: Mike Gordon From: John Matthews Subject: HOL2ACL2 Date: Thu, 29 Jul 2004 09:10:31 -0700 Hi Mike, I've quickly read through your project blog. Cool idea. I'd like to comment on one paragraph there: | The next step is to define an EVAL function that gives the meaning | of the ACL2 logic (as evaluation may not terminate, this may need to | be represented as an inductively defined relation). Although Common Lisp may not terminate, ACL2's logic is one of total functions. In fact, ACL2's logic has a first-order Hilbert choice schema, called DEFCHOOSE. I think this makes ACL2 at least as strong as classical first order logic + PRA. Are you intending to translate into ACL2's logic, and have ACL2 check the translated forms, or are you going to use ACL2 in "program" mode, where ACL2 just assumes you know what you are doing and executes the translated functions? If you are intending to translate into ACL2's logic, then I think the EVAL function can be total, using HOL's Hilbert choice operator. That is (using pseudo-Isabelle syntax, sorry), EVAL_sexp :: ACL2_environment => sexp => sexp EVAL_fun_def :: ACL2_environment => sexp => (sexp => sexp) EVAL_fun_def env ``(defun f (x) (body (x))'' == (SOME f. ALL x. f x = EVAL_sexp (extend_environment ("x",x) env) ``body (x)'') In fact, now that I think about it I wonder if ACL2's DEFUN form for admitting well-founded recursive functions could be placed outside of the ACL2 kernel, in principle. I might ask the acl2-help mailing list about this. | We then want to set up type-indexed families of translation | functions from HOL values into values of type sexp, and to somehow | have a mechanism for generating -- maybe "on the fly" -- theorems | relating EVAL to the semantics of the encoded values. Exactly how to | do this is very fuzzy to me, especially how much can be done inside | the logic and what has to be scripted in the metalanguage | ML. (Konrad, what do you think? I suspect you've "been there, done | that"!) As you no doubt already know, the Isabelle theorem prover allows type-overloaded constant definitions. This would allow for defining a single overloaded ACL2 translation function, rather than requiring a whole family of type-specific translation functions. The key advantage is that now simple rewriting will automatically perform the translation, no special ML tactics are required. We don't actually need to extend HOL4's type system, we just need a more powerful constant-definition scheme, that allows different definitions of a constant at different types. It would have to check that each such definition is actually at a different type. Isabelle also allows these definitions to be primitive-recursive over the structure of types, by checking that every recursive occurrence of the constant being defined is to a simpler type. If the constant-definition scheme is incremental, that is it allows definitions at different types to be given over time, rather than all at once, then it turns out that you can get all of the power of Isabelle's axiomatic type classes, without extending the type system. The way it works is that one defines a polymorphic predicate stating the desired type-class axioms over a set of type-indexed constants. The type-indexed constants have been declared, but not yet defined. As the user assigns definitions to the constants at various types, they can prove that the type-class predicate holds of the constant at those types. General theorems that hold over all instances of the type class can be proved one-and-for-all by making them conditional on the type-class predicate. Here is an example of these ideas in the context of HOL2ACL2. First, you can define a polymorphic predicate corresponding to an "Embeddable" axiomatic type-class, stating whether a given type can be embedded in ACL2's value universe, by first declaring (but not yet defining) two type-indexed constants: consts embed_in_sexp :: 'a => sexp project_from_sexp :: sexp => 'a The type-class predicate is now just constdefs Embeddable :: 'a => bool Embeddable (x::'a) == (ALL (y::'a). project_from_sexp (embed_in_sexp y) = y The argument X to Embeddable is only used for it's type, to allow the boolean returned by Embeddable to depend on the type of X. Then you can start giving definitions of the constants embed_in_sexp and project_from_sexp at particular types, to make those types instances of the Embeddable type class: constdefs (overloaded) embed_in_sexp_num_def: embed_in_sexp (x::num) == ACL2_NUMBER x embed_in_sexp_num_def: project_from_sexp sexp == (SOME (y::num). sexp = embed_in_sexp y) (Actually, project_from_sexp doesn't need to be overloaded, it could simply have the above definition at all types. But proving that a type is Embeddable is often easier with type-specific definitions of project_from_sexp.) Here is another instance at the bool type. constdefs (overloaded) embed_in_sexp_bool_def: embed_in_sexp (b::bool) == ACL2_SYMBOL "ACL2" "t" project_from_sexp_bool_def: project_from_sexp sexp == (not (sexp = ACL2_SYMBOL "ACL2" "nil")) Now to show that the num and bool types are instances of Embeddable, you simply prove the following theorems: theorem num_embeddable: |- Embeddable (x::num) theorem bool_embeddable: |- Embeddable (b::bool) Using primitive type recursion, you can go on to show that pairs of Embeddable types are Embeddable. constdefs (overloaded) embed_in_sexp_pair_def: embed_in_sexp (x::('a * 'b)) == ACL2_PAIR (embed_in_sexp (fst x)) (embed_in_sexp (snd x)) project_from_sexp_pair_def: project_from_sexp sexp == (ACL2_CAR sexp, ACL2_CDR sexp) theorem prod_Embeddable: |- Embeddable x /\ Embeddable y ==> Embeddable (x,y) Now you can just use HOL4's rewriter to translate, for example project_from_sexp (ACL2_CONS (ACL2_CONS (ACL2_NUMBER 1) (ACL2_NUMBER 2)) (ACL2_SYMBOL "ACL2" "nil")) to ((1,2),false), using the collection of definitions for project_from_sexp and equalities between ACL2_CONS, ACL2_CAR, and ACL2_CDR. There are also tricks for controlling what type a given sexp should be translated to, namely using type-indexed coercion functions. For example, to ensure that project_from_sexp translates (ACL2_NUMBER 2) to a Boolean within a larger expression, we first define the coercion function constdefs force_bool :: sexp => sexp force_bool sexp == embed_in_sexp ((project_from_sexp sexp)::bool) Now the expression project_from_sexp (ACL2_CONS (ACL2_CONS (ACL2_NUMBER 1, ACL2_NUMBER 2)) (force_bool (ACL2_NUMBER 3))) rewrites to ((1,2),true), as desired. To try these ideas out quickly, we could start by simply axiomatizing the embed_to_sexp and project_from_sexp overloaded constants. Whenever someone gets around to implementing an overloaded constant definition facility in HOL4 then the axioms could be turned into theorems. In the meantime we could point to the Isabelle literature where the soundness of overloaded constant definitions is established. To summarize, I believe that type-indexed constant definitions allow polytypic theorem proving to be carried out entirely within the logic. Isabelle can already do it. It shouldn't be too hard to add the capability to HOL4. Whew, that was a mouthful! :) -john

Konrad Slind

Date: Mon, 02 Aug 2004 16:18:32 -0600 From: Konrad Slind To: Mike Gordon Subject: Re: HOL/ACL2: some very first thoughts Mike Gordon wrote: [stuff deleted] Hi Mike, Seems like a nice project. A couple of points: * ascii in the datatype should be char (and HOL-4 chars seem identical to ACL2 chars) * Is the following something like the scenario you are thinking of? - suppose there are functions "toACL2:term -> sexp" and "meaning:sexp->term" such that |- meaning (toACL2 (P)) = P is provable in HOL. toACL2 isn't definable in HOL and neither is meaning. However, we can mechanically construct them on an ad-hoc basis using the type of P (and prove the above theorem automatically too, I suppose). - Suppose that EVAL:sexp -> sexp has been defined in HOL (omitting environment and probable relational nature of EVAL for this discussion). We should be able to prove in HOL: |- meaning (sexp) = meaning (EVAL(sexp)) (This may also have to be a type-indexed proof.) - Given a goal G, we build the proposition "toACL2(G)", evaluate it to get an s-expression and send that to ACL2 for evaluation, getting back some expression H. We make the annotated theorem |- EVAL(toACL2(G)) = H and deduce |- meaning (EVAL(toACL2(G))) = meaning H By the theorem on EVAL we can get |- meaning (toACL2(G)) = meaning H Now meaning should invert toACL2, so we get |- G = meaning H and rewriting with the meaning function definition should lift the sexpression H to its HOL semantics, which should (hopefully) be comprehensible to the user. [stuff deleted] A final note: if it happens that ACL2 returns a function value, then there's the question of how to translate it back tp HOL. There's some literature on this, sparked by a paper by Berger and Schwichtenberg on "normalization-by-evaluation". Paper is attached (pretty technical IMO) "An inverse of the evaluation functional for typed lambda calculus", LICS'91, pp203-211. [attachment deleted (available from Ulrich Berger's homepage)] Cheers, Konrad.

Page maintained by Mike Gordon.
Version 1.04, Mon Aug 2, 2004