Delivery-Date: 
Sender: info-hol-request@lal.cs.byu.edu
Errors-To: info-hol-request@lal.cs.byu.edu
Precedence: list
To: info-hol@leopard.cs.byu.edu
Subject: Making some big memory savings in hol90
In-Reply-To: Your message of "Thu, 04 May 1995 14:35:06 +0200." <95May4.143513met_dst.8067@sunbroy14.informatik.tu-muenchen.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 17 May 1995 13:52:45 +0100
From: Donald Syme <Donald.Syme@cl.cam.ac.uk>
Message-Id: <"swan.cl.cam.:255280:950517125256"@cl.cam.ac.uk>


Fellow info-holler's,

After a discussion with John Harrison, I've been playing with 
techniques for reducing the memory usage of hol90.  The changes
outlined below seem to allow around a 60%-70% reduction in the
amount of memory used by theorems loaded in from theory files.
For a large set of libraries this could easily result in a 5-10MB
saving, when you include the fact that NJML more than doubles the heap.
This comes at a small cost: initial theory access is slower 
(about 25% slower), but if you use "save_hol" after setting up bindings
anyway, this should be OK.
Note the size of the core image of hol90 is *not* reduced, nor
is the amount of heap used when adding compiled code.

The changes involve keeping hash tables
of types, terms and subterms drawn from theory files.
Identical fragments of types and terms are shared between theorems
that come from theory files.  This reduces the 
amount of memory used and may mean equality tests
are faster (as they become pointer comparisions).  The only
thing slowed down is theory loading (not too bad if you use
"save_hol" anyway), which is slowed down by around 25%.

It turns out that about 50% of all theorem memory is taken
by duplicated type information, probably on polymorphic
constants like ! and ?.  

I've talked with Konrad and John Harrison about this, and they
seem to agree it's sound.  I guess a "proper" (vetted) version might
well appear in hol90.8, but since the savings are quite substantial 
I thought that people might be interested in adding the changes 
and recompiling their current copy of hol90.7.  If anyone wants
the code changes, email me and I will send them to you.
The files effected are:
	0/type.sml
	0/type.sig
	0/term.sml
	0/term.sig
	0/theory/thms.yak.sml

I've done alot of performance testing on different possible
applications of term/type shrinking.  For instance, it is possible to 
shrink every type and term created by the system against the current
storage table, but this will really slow down many basic operations.  I
have been making the "real" library to test speed differences, and
loading in other libraries for testing size differences.
The timing and memory statistics for most of the tests are
included below.

The best option seems to to only shrink all terms and types coming out 
of theory files, and *not* do any dynamic shrinking of terms and
types.  This seems to give a good tradeoff between speed and memory usage.

So, if anyone wants the code, let me know.  It needs to be improved
(e.g. the hash table size should change dynamically).

Cheers,
Don Syme

PS. I would imagine the same changes can be implemented in hol88
too, with similar savings.

PPS. I'm not planning to do any more work on this.

------------------- Timing & Memory Usage Results ------------------

Table 1: Different Tests conducted

	    shrink/     shrink/   shrink     shrink      remember 
           remember    remember   terms      terms       types
            types       terms     created     from        from
             from       from       from    subst/inst/   mk_type
           theories    theories  preterms   beta_conv
--------------------------------------------------------------------
TEST BASE        
TEST A       naive                                         yes
TEST B       naive      naive       yes         yes        yes
TEST C       naive      naive       yes                    yes
TETS D       naive      recurs      yes                    yes
TEST E       naive      constants   yes                    yes
TEST F       recurs     recurs

remember (naive) = The term/type is added to the table.  

remember (recurs) = An optimized way of adding entire terms/types to the table.
The term/type and all its subterms/types are stored in 
the hash table recursively, whilst simultaneously computing hash
values.  The operation is O(n), whereas doing a remember & hash on 
each subterm/subtype is O(n^2)

remember (constants) = Only add constant terms which are read from theory
files.

shrink = the term/type is optimized against the table, i.e. if an
identical copy exists in the table it is used instead.  The term/type is
not added to the table.

(Test F turns out to be the best.)


Table 2: Extra Size Added to image after save_hol after adding theories
and libraries.

THEORY	    	BASE	A	B	D	E	F
----------------------------------------------------------------------------
arithmetic	240	136	109	100	123	100
set		1026	561	553			507
List		1903	982	970			938

arithmetic +
list +
List +		3066	1918	1712	1696	1810	1694
string +
set
% Saving:	0%	37%	44%	45%	41%	44%
----------------------------------------------------------------------------
Nb. libraries include compiled code as well.
Nb. savings are doubled because SML copies heap

Table 3: Stats for rebuilding entire "real" library
(excluding building "rhol" base executable)

		BASE	B	C	D	E	F
---------------------------------------------------------------------------
user time (s)	477	c1500	881	679	623	579
% increase	0	200%	85%	42%	30%	21%
gc time (s)	88		108	104	105	99
max heap (K)	21310		19122	19010	19534	19258
---------------------------------------------------------------------------
(A) building "real" library is especially intensive on theory loading.
(B) tests for B not finished due to large increases in run times
indicating that optimizes in subst,inst and beta_conv aren't worth it.

Overall Notes:

Times were measured by using System.Timer.

The memory usage was measured by performing "save_hol"'s to /dev/null,
and looking at the KB of extra heap added, as reported in the final 
GC message.  I also looked at the highest level of heap allocated in
the GC messages.