Delivery-Date: 
Sender: info-hol-request@lal.cs.byu.edu
Errors-To: info-hol-request@lal.cs.byu.edu
Precedence: bulk
From: David Shepherd <des@inmos.co.uk>
Message-Id: <4335.9405111439@frogland.inmos.co.uk>
Subject: Re: does there exists a hol-lex?
To: info-hol@cs.uidaho.edu (info-hol mailing list)
Date: Wed, 11 May 1994 15:39:29 +0100 (BST)
Content-Type: text
Content-Length: 1420

Konrad Slind has said:
> > lexgen/mlyacc won't work well for a set of states larger than 255,
> > because then, it switches to integer arrays to represent them and
> > there is a known bug with integer arrays, which causes SML/NJ to
> > increase memory to 100MB or even more. So, for large grammars
> > lexgen/mlyacc won't work very well up to now.
> 
> I think I was the first to discover this bug when I added
> quote/antiquote to the SML/NJ compiler, bumping the lexer from 242
> states to just over 255. It's very easy to get around: rather than
> having each alphanumeric keyword (say) as a separate production, have a
> single production for all alphanumerics and inside that, do a case
> statement on the string denoted by "yytext".

Hmm ... that's exactly the way that they do it in the example parser
for Pascal in the mlyacc/examples/pascal directory. You don't even need
to code up the big case statement - just use the code there, which
builds up a (hashed) array mapping keyword lexemes to their tokens.
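
As a rough illustration of the single-rule idea, here is a minimal sketch
in Python rather than ML-Lex. The keyword set, the token names and the
tokenize function are all made up for the example and are not taken from
the mlyacc sources; the hashed table is just a Python dict.

```python
import re

# Hypothetical keyword table (lexeme -> token name). A dict is a hash table,
# standing in for the hashed array used by the Pascal example.
KEYWORDS = {
    "begin": "BEGIN",
    "end": "END",
    "if": "IF",
    "then": "THEN",
    "else": "ELSE",
}

# A single alternative matches every alphanumeric lexeme, so adding more
# keywords grows the table, not the number of lexer states.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<alnum>[A-Za-z][A-Za-z0-9_]*)|(?P<num>[0-9]+)|(?P<sym>\S))"
)


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs for the input text."""
    text = text.rstrip()  # guarantees a non-space character remains to match
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m.group("alnum") is not None:
            lexeme = m.group("alnum")  # plays the role of yytext
            # Keyword if the lexeme is in the table, otherwise an identifier.
            tokens.append((KEYWORDS.get(lexeme, "ID"), lexeme))
        elif m.group("num") is not None:
            tokens.append(("NUM", m.group("num")))
        else:
            tokens.append(("SYM", m.group("sym")))
        pos = m.end()
    return tokens
```

Calling tokenize("if x1 then y") tags "if" and "then" as keywords and
"x1" and "y" as plain identifiers, all through the one alphanumeric rule.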

Quite honestly I hadn't even thought that you might want to stick in
all the alphanumeric keywords as separate production rules!

--------------------------------------------------------------------------
david shepherd: des@inmos.co.uk                     tel: 0454-616616 x 625
                inmos ltd, 1000 aztec west, almondsbury, bristol, bs12 4sq
		"I  am  not  a  nut      ---      I  am  a  human  being."