Return-Path: <John.Harrison-request@cl.cam.ac.uk>
Delivery-Date: 
Received: from leopard.cs.byu.edu (no rfc931) by swan.cl.cam.ac.uk 
          with SMTP (PP-6.5) outside ac.uk; Wed, 11 May 1994 15:57:55 +0100
Received: by leopard.cs.byu.edu (1.37.109.8/16.2) id AA18338;
          Wed, 11 May 1994 08:40:34 -0600
Sender: info-hol-request@lal.cs.byu.edu
Errors-To: info-hol-request@lal.cs.byu.edu
Precedence: bulk
Received: from dworshak.cs.uidaho.edu by leopard.cs.byu.edu 
          with SMTP (1.37.109.8/16.2) id AA18334;
          Wed, 11 May 1994 08:40:33 -0600
Received: from oberon.inmos.co.uk by dworshak.cs.uidaho.edu 
          with SMTP (1.37.109.8/16.2) id AA11636;
          Wed, 11 May 1994 07:41:21 -0700
Received: from frogland.inmos.co.uk by oberon.inmos.co.uk;
          Wed, 11 May 1994 15:41:18 +0100
From: David Shepherd <des@inmos.co.uk>
Message-Id: <4335.9405111439@frogland.inmos.co.uk>
Subject: Re: does there exists a hol-lex?
To: info-hol@cs.uidaho.edu (info-hol mailing list)
Date: Wed, 11 May 1994 15:39:29 +0100 (BST)
X-Mailer: ELM [version 2.4 PL20]
Content-Type: text
Content-Length: 1420

Konrad Slind has said:
> > lexgen/mlyacc won't work well for a set of states larger than 255,
> > because then, it switches to integer arrays to represent them and
> > there is a known bug with integer arrays, which causes SML/NJ to
> > increase memory to 100MB or even more. So, for large grammars
> > lexgen/mlyacc won't work very well up to now.
> 
> I think I was the first to discover this bug when I added
> quote/antiquote to the SML/NJ compiler, bumping the lexer from 242
> states to just over 255. It's very easy to get around: rather than
> having each alphanumeric keyword (say) as a separate production, have a
> single production for all alphanumerics and inside that, do a case
> statement on the string denoted by "yytext".

Hmm ... that's exactly the way that they do it in the example parser
for pascal in the mlyacc/examples/pascal directory. You don't even need
to code up the big case statement - just use the code there which
builds up a (hashed) array of keywords against lexemes.
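
As a rough illustration of the idea (this is not the code from the
mlyacc/examples/pascal directory, just a minimal Python sketch of the same
technique), a single rule can match every alphanumeric lexeme and a hashed
table can then decide whether the matched text is a keyword or an ordinary
identifier. The names KEYWORDS, classify and tokenize, and the handful of
keywords listed, are made up for this example.

```python
import re

# Hypothetical keyword table: lexeme -> token kind. A dict is Python's
# built-in hash table, standing in for a hashed keyword array.
KEYWORDS = {
    "begin": "BEGIN",
    "end": "END",
    "if": "IF",
    "then": "THEN",
    "else": "ELSE",
    "while": "WHILE",
}

# One rule covers every alphanumeric lexeme, so adding a keyword adds a
# table entry rather than another lexer rule.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<alnum>[A-Za-z][A-Za-z0-9_]*)|(?P<num>[0-9]+)|(?P<sym>\S))"
)


def classify(yytext):
    """Return the keyword token kind for yytext, or "ID" if it is not a keyword."""
    return KEYWORDS.get(yytext, "ID")


def tokenize(source):
    """Split source into (kind, text) pairs using the single alphanumeric rule."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:  # only trailing whitespace remains
            break
        pos = m.end()
        if m.group("alnum") is not None:
            text = m.group("alnum")
            tokens.append((classify(text), text))
        elif m.group("num") is not None:
            tokens.append(("NUM", m.group("num")))
        else:
            tokens.append(("SYM", m.group("sym")))
    return tokens
```

In an ml-lex specification the analogous move would be one rule for
alphanumerics whose action looks up yytext in the table, so the number of
lexer states no longer grows with the number of keywords.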

Quite honestly I hadn't even thought that you might want to stick in
all the alphanumeric keywords as separate production rules!

--------------------------------------------------------------------------
david shepherd: des@inmos.co.uk                     tel: 0454-616616 x 625
                inmos ltd, 1000 aztec west, almondsbury, bristol, bs12 4sq
		"I  am  not  a  nut      ---      I  am  a  human  being."
