Return-Path: <John.Harrison-request@cl.cam.ac.uk>
Delivery-Date: 
Received: from leopard.cs.byu.edu (no rfc931) by swan.cl.cam.ac.uk 
          with SMTP (PP-6.5) outside ac.uk; Wed, 25 Jan 1995 23:29:35 +0000
Received: by leopard.cs.byu.edu (1.38.193.4/16.2) id AA09072;
          Wed, 25 Jan 1995 16:21:42 -0700
Sender: info-hol-request@lal.cs.byu.edu
Errors-To: info-hol-request@lal.cs.byu.edu
Precedence: bulk
Received: from SIMON.CS.CORNELL.EDU by leopard.cs.byu.edu 
          with SMTP (1.38.193.4/16.2) id AA09068;
          Wed, 25 Jan 1995 16:21:40 -0700
Received: from cloyd.cs.cornell.edu (CLOYD.CS.CORNELL.EDU [128.84.227.15]) 
          by simon.cs.cornell.edu (8.6.9/R1.01) with SMTP id SAA03504;
          Wed, 25 Jan 1995 18:18:43 -0500
Received: from DENEB.CS.CORNELL.EDU by cloyd.cs.cornell.edu (5.67/I-1.99D) 
          id AA24182; Wed, 25 Jan 95 18:18:49 -0500
Received: (murthy@localhost) by deneb.cs.cornell.edu (8.6.9/C1.00) id SAA05527;
          Wed, 25 Jan 1995 18:18:40 -0500
Message-Id: <199501252318.SAA05527@deneb.cs.cornell.edu>
To: John Harrison <John.Harrison@cl.cam.ac.uk>
Cc: info-hol@leopard.cs.byu.edu, murthy@cs.cornell.edu
Subject: Re: Difficulties with large terms
In-Reply-To: Your message of "Wed, 25 Jan 1995 21:30:19 GMT." <"swan.cl.cam.:186310:950125213211"@cl.cam.ac.uk>
Date: Wed, 25 Jan 1995 18:18:40 -0500
From: Chet Murthy <murthy@cs.cornell.edu>


>>>>> "JH" == John Harrison <John.Harrison@cl.cam.ac.uk> writes:

    JH> Yes, which makes "stress-testing" it rather instructive. For
    JH> example, I sometimes wonder about the efficiency of de Bruijn
    JH> terms when the terms are (a) very large and (b) rich in nested
    JH> abstractions. Simply traversing the term, regardless of what
    JH> you actually do, could then be quadratic, because each
    JH> dest_abs stimulates a subtraversal replacing the dB index with
    JH> a free variable (lots more consing too...) On the other hand I
    JH> suppose situations where (a) and (b) are both true are not
    JH> easy to imagine.

I did exactly this -- compared explicit names with deBruijn numbers,
and, indeed, dB numbers were *much* faster on almost all the examples.

I did it on the entire corpus of examples of the Coq system, after
having modified, in a systematic manner, the entire Coq system to use
explicit names instead of dB numbers (at the time, I "believed" in
explicit names).

The slowdown with explicit names was more than a factor of 4.  *But*
the gap seemed to decrease with the size of problem (in Coq sometimes
there are problems which truly are huge, since Coq builds a
"proof-term" which is type-checked as a unitary whole).

I never checked (scientifically, in a controlled experiment) the
performance relation as a function of size, but I had the distinct
impression that the gap closed as terms got bigger.

But John's example (it seems to me) is only going to work if you
cache, the free-variables of terms at some of the abstractions.

Then, you could short-circuit, for instance, substitution, by noticing
that the free-variables of the body of an abstraction, and the domain
of the substitution, were disjoint.

But I suspect that if we put the same energy into doing the same sorts
of caching for deBruijn numbers, we could achieve the same effect, *on
top* of the already superior speed for smaller problems.

Of course, I haven't verified this, either.

--chet--
