To: appel@princeton.edu
Cc: S.M.Kahrs@ukc.ac.uk
Subject: cup parser generator
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Date: Tue, 10 Mar 1998 16:58:43 +0000
From: "S.M.Kahrs" <S.M.Kahrs@ukc.ac.uk>

Hi Andrew,

I'm currently teaching a Compiling Techniques course in Java using your book
and jlex/cup.  I thought you might be interested in some feedback.

Here are a couple of points about cup which I saw students struggling with:

- default precedence: a token that has not been given an explicit
  precedence is given lower precedence than all the other tokens.  This
  is different from yacc, which always resolves conflicts involving a
  token without precedence in favour of shift.

  I would rather have the parser generators leave such conflicts as
  conflicts, possibly turning them into warnings.  I had a few students
  who fell into the trap of believing that a grammar was conflict-free
  because cup said so, when in truth cup had resolved conflicts using the
  unknown-low-precedence principle, giving them the wrong result.  To be
  fair, yacc's solution would have given them the same problem, just for
  a slightly different reason.

- the syntax of cup is overly rigid (e.g. the action-code part must
  precede the parser-code part) and noisy (e.g. these code sections
  are terminated first by ':}' and then by ';')
[ed. note: issue addressed in CUP 0.10j]

- the interface between cup and jlex is not exactly brilliantly
  designed; my personal pet peeve (so far) is the treatment of
  end-of-file; I think jlex is the more reasonable of the two in
  returning null by default.
[ed. note: issue (at least partially) addressed in CUP 0.10j]
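
[ed. note: the three policies discussed in the default-precedence point
can be compared with a small model.  The sketch below is not cup's or
yacc's code; it only encodes the behaviour as described in this message.
The class name, the method names, the precedence table and the rule that
two tokens without precedence remain a conflict are all assumptions made
for illustration, as is treating every declared token as
left-associative.]

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: how a shift/reduce conflict might be resolved
// under the three policies discussed above. Not taken from cup or yacc.
class ConflictPolicies {
    enum Action { SHIFT, REDUCE, CONFLICT }

    // Hypothetical precedence table: a larger number binds tighter.
    // MINUS is deliberately absent, i.e. it has no declared precedence.
    static final Map<String, Integer> DECLARED = new HashMap<>();
    static {
        DECLARED.put("PLUS", 1);
        DECLARED.put("TIMES", 2);
    }

    // Ordinary comparison when both sides have a usable precedence.
    // Assumption: all declared tokens are left-associative, so a tie reduces.
    static Action byPrecedence(int rule, int lookahead) {
        return lookahead > rule ? Action.SHIFT : Action.REDUCE;
    }

    // cup as described in the message: a token without precedence is
    // treated as lower than every declared token, so the conflict is
    // resolved silently.
    static Action cupAsDescribed(String ruleToken, String lookahead) {
        Integer r = DECLARED.get(ruleToken);
        Integer l = DECLARED.get(lookahead);
        if (r == null && l == null) {
            return Action.CONFLICT; // assumption: nothing to compare
        }
        int rule = (r == null) ? 0 : r; // 0 is below every declared level
        int look = (l == null) ? 0 : l;
        return byPrecedence(rule, look);
    }

    // yacc as described in the message: any conflict involving a token
    // without precedence is resolved in favour of shift.
    static Action yaccAsDescribed(String ruleToken, String lookahead) {
        Integer r = DECLARED.get(ruleToken);
        Integer l = DECLARED.get(lookahead);
        if (r == null || l == null) {
            return Action.SHIFT;
        }
        return byPrecedence(r, l);
    }

    // The policy the message argues for: leave such conflicts unresolved
    // so that the generator can report them.
    static Action leaveAsConflict(String ruleToken, String lookahead) {
        Integer r = DECLARED.get(ruleToken);
        Integer l = DECLARED.get(lookahead);
        if (r == null || l == null) {
            return Action.CONFLICT;
        }
        return byPrecedence(r, l);
    }

    public static void main(String[] args) {
        // Reducing a PLUS rule while the lookahead is the undeclared MINUS.
        System.out.println("cup:      " + cupAsDescribed("PLUS", "MINUS"));
        System.out.println("yacc:     " + yaccAsDescribed("PLUS", "MINUS"));
        System.out.println("proposed: " + leaveAsConflict("PLUS", "MINUS"));
    }
}
```

In this model the first two policies both pick an action without
complaint, but they can pick different ones; only the third leaves the
grammar writer something to notice.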

Another option I'd suggest for jlex is 'standalone', e.g. something like

%standalone

in the directives part.  The idea is to use jlex not as a scanner
generator for a compiler but as a generator for a finite state machine
(I personally use lex much more often for this purpose, e.g. for doing
some funny stuff with latex documents).  It is fairly easy to achieve
this with lex/flex, but currently one has to write a lot of boilerplate
(Java is so noisy) to achieve this with jlex.

Stefan Kahrs