Delivery-Date: 
Sender: info-hol-request@lal.cs.byu.edu
Errors-To: info-hol-request@lal.cs.byu.edu
Precedence: bulk
Message-Id: <199501252318.SAA05527@deneb.cs.cornell.edu>
To: John Harrison <John.Harrison@cl.cam.ac.uk>
Cc: info-hol@leopard.cs.byu.edu, murthy@cs.cornell.edu
Subject: Re: Difficulties with large terms
In-Reply-To: Your message of "Wed, 25 Jan 1995 21:30:19 GMT." <"swan.cl.cam.:186310:950125213211"@cl.cam.ac.uk>
Date: Wed, 25 Jan 1995 18:18:40 -0500
From: Chet Murthy <murthy@cs.cornell.edu>


>>>>> "JH" == John Harrison <John.Harrison@cl.cam.ac.uk> writes:

    JH> Yes, which makes "stress-testing" it rather instructive. For
    JH> example, I sometimes wonder about the efficiency of de Bruijn
    JH> terms when the terms are (a) very large and (b) rich in nested
    JH> abstractions. Simply traversing the term, regardless of what
    JH> you actually do, could then be quadratic, because each
    JH> dest_abs stimulates a subtraversal replacing the dB index with
    JH> a free variable (lots more consing too...) On the other hand I
    JH> suppose situations where (a) and (b) are both true are not
    JH> easy to imagine.
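
To make the cost John describes concrete, here is a minimal sketch in Python. It is not HOL's implementation: the term classes, `instantiate`, `traverse`, `nested` and `cost` are hypothetical names made up for illustration, and only `dest_abs` borrows its name from the quoted text. Each `dest_abs` re-walks the body to replace index 0 with a free variable, so a plain traversal of n nested abstractions touches about n(n+1)/2 nodes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bound:            # de Bruijn index
    index: int

@dataclass(frozen=True)
class Free:             # named free variable
    name: str

@dataclass(frozen=True)
class Abs:              # abstraction; the binder carries no name
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def instantiate(term, depth, var, counter):
    """Replace the index bound at `depth` with `var`, counting visited nodes."""
    counter[0] += 1
    if isinstance(term, Bound):
        return var if term.index == depth else term
    if isinstance(term, Abs):
        return Abs(instantiate(term.body, depth + 1, var, counter))
    if isinstance(term, App):
        return App(instantiate(term.fn, depth, var, counter),
                   instantiate(term.arg, depth, var, counter))
    return term  # Free variables are left alone

def dest_abs(term, name, counter):
    """Open an abstraction: return a free variable and the instantiated body."""
    var = Free(name)
    return var, instantiate(term.body, 0, var, counter)

def traverse(term, counter, depth=0):
    """Walk the whole term, opening every abstraction with dest_abs."""
    if isinstance(term, Abs):
        _, body = dest_abs(term, f"x{depth}", counter)
        traverse(body, counter, depth + 1)
    elif isinstance(term, App):
        traverse(term.fn, counter, depth)
        traverse(term.arg, counter, depth)

def nested(n):
    """Build n abstractions wrapped around a single bound variable."""
    term = Bound(0)
    for _ in range(n):
        term = Abs(term)
    return term

def cost(n):
    """Number of nodes touched by instantiate while traversing nested(n)."""
    counter = [0]
    traverse(nested(n), counter)
    return counter[0]
```

With this sketch, `cost(10)` is 55 and `cost(20)` is 210, although the terms themselves have only 11 and 21 nodes.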

I did exactly this -- compared explicit names with deBruijn numbers,
and, indeed, dB numbers were *much* faster on almost all the examples.

I did it on the entire corpus of examples of the Coq system, after
having modified, in a systematic manner, the entire Coq system to use
explicit names instead of dB numbers (at the time, I "believed" in
explicit names).

The slowdown with explicit names was more than a factor of 4.  *But*
the gap seemed to decrease with the size of the problem (in Coq there
are sometimes problems which are truly huge, since Coq builds a
"proof-term" which is type-checked as a unitary whole).

I never checked (scientifically, in a controlled experiment) the
performance relation as a function of size, but I had the distinct
impression that the gap closed as terms got bigger.

But John's example (it seems to me) is only going to work if you
cache the free variables of terms at some of the abstractions.

Then you could short-circuit substitution, for instance, by noticing
that the free variables of the body of an abstraction and the domain
of the substitution are disjoint.
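
A minimal sketch of that short-circuit, in Python, under some loudly stated assumptions: the classes `Var`, `App`, `Lam` and the function `subst` are hypothetical, the free-variable set is cached at every node rather than only at some abstractions, and the replacement terms are assumed to contain no variables that a binder could capture (so no renaming is done).

```python
class Var:
    def __init__(self, name):
        self.name = name
        self.fv = frozenset([name])          # cached free variables

class App:
    def __init__(self, fn, arg):
        self.fn, self.arg = fn, arg
        self.fv = fn.fv | arg.fv             # cached once, at construction

class Lam:
    def __init__(self, param, body):
        self.param, self.body = param, body
        self.fv = body.fv - {param}          # the binder removes its own name

def subst(term, env, stats):
    """Apply env (a dict from names to terms) to term.

    Assumes the replacement terms cannot be captured by any binder in
    `term`. stats["visited"] counts the nodes actually inspected.
    """
    stats["visited"] += 1
    # Short-circuit: nothing in the substitution's domain occurs free here.
    if term.fv.isdisjoint(env):
        return term
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, App):
        return App(subst(term.fn, env, stats), subst(term.arg, env, stats))
    # Lam: the bound name shadows any entry for it in env.
    inner = {k: v for k, v in env.items() if k != term.param}
    return Lam(term.param, subst(term.body, inner, stats))

# Usage: a large closed abstraction applied to a free variable y.
body = Var("x")
for _ in range(100):
    body = App(body, Var("x"))
big = Lam("x", body)
term = App(big, Var("y"))

stats = {"visited": 0}
result = subst(term, {"y": Var("z")}, stats)
```

Here the substitution inspects only three nodes (the application, the abstraction and `y`), and the untouched abstraction is shared between the old and new terms. The price is the extra set stored per node, which is presumably why one would cache at only some of the abstractions.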

But I suspect that if we put the same energy into doing the same sorts
of caching for deBruijn numbers, we could achieve the same effect, *on
top* of the already superior speed for smaller problems.

Of course, I haven't verified this, either.
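
For what such caching might look like on the de Bruijn side, here is an equally unverified sketch in Python. It is an illustration, not anything measured above or taken from Coq: the classes `Idx`, `App`, `Lam`, the cached field `max_free` (one more than the largest free index, 0 for a closed term) and `subst` are all hypothetical, and the replacement term is assumed closed so that no lifting is needed.

```python
class Idx:
    def __init__(self, n):
        self.n = n
        self.max_free = n + 1                # index n is free at depth 0

class App:
    def __init__(self, fn, arg):
        self.fn, self.arg = fn, arg
        self.max_free = max(fn.max_free, arg.max_free)

class Lam:
    def __init__(self, body):
        self.body = body
        self.max_free = max(body.max_free - 1, 0)   # the binder absorbs index 0

def subst(term, k, repl, stats):
    """Replace free index k by repl and decrement the indices above k.

    Assumes repl is closed (repl.max_free == 0). stats["visited"] counts
    the nodes actually inspected.
    """
    stats["visited"] += 1
    # Short-circuit: no index >= k occurs free, so nothing can change.
    if term.max_free <= k:
        return term
    if isinstance(term, Idx):
        # max_free > k guarantees term.n >= k here.
        return repl if term.n == k else Idx(term.n - 1)
    if isinstance(term, App):
        return App(subst(term.fn, k, repl, stats),
                   subst(term.arg, k, repl, stats))
    return Lam(subst(term.body, k + 1, repl, stats))

# Usage: a large closed abstraction applied to the variable being replaced.
inner = Idx(0)
for _ in range(100):
    inner = App(inner, Idx(0))
closed_big = Lam(inner)
term = App(closed_big, Idx(0))
repl = Lam(Idx(0))

stats = {"visited": 0}
result = subst(term, 0, repl, stats)
```

As in the named case, only three nodes are inspected and the closed subterm is returned unchanged, but the cache is a single integer per node instead of a set of names.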

--chet--