COS 320 - Assignment 4

Compiling Techniques, Spring 2012, Princeton University.      Due date: Monday 5 March.

Lexical analyzer


Setup:
Make a new directory as4 for this assignment by unpacking as4.zip.

Run sml and compile the code using CM.make "sources.cm";

Build a lexer for the Fun language. To do this, read The definition of Fun and work out every token that your lexer must recognize.

You will build your lexer using ML-Lex. There is online documentation on ML-Lex here. There is also some help in your textbook (Appel, chapter 2).

  1. Read and understand the following files: sources.cm, errormsg.sml, tokens.sig, tokens.sml, fun.lex, runlex.sml.
    (Actually, you can get by without understanding the stuff about "('svalue,'pos) token" in tokens.sig. However, do the experiment of writing a little program that calls Tokens.ARROW(0,0), just to see what it does. You can even make this call from the SML/NJ interactive top-level prompt.)

    (Also, I wouldn't think any less of you if you didn't bother to understand the first four lines of fun.lex; that's just boilerplate for attaching the lexer to the parser.)

  2. Edit the code in fun.lex. You will have to remove some of the sample code and add a lot of your own.
  3. Nested comments. This assignment would be a lot easier if the Fun language didn't have nested comments. I recommend that you first get everything working except nested comments. Then add that feature as your time permits; it is worth 10% of this assignment's grade.
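
The counting idea behind nested comments can be sketched in plain SML, independent of ML-Lex. This is only an illustration under stated assumptions: it assumes comments open with /* and close with */ (check The definition of Fun for the actual delimiters), and stripComments is a hypothetical name that does not appear in the provided files. It keeps a depth counter that goes up on every opener and down on every closer, and it keeps a character only while the depth is zero.

```sml
(* Assumption: comments open with "/*" and close with "*/". *)
(* Returns SOME text with all comments removed, or NONE if a comment
   is still open when the input ends. *)
fun stripComments (s : string) : string option =
  let
    (* depth = number of currently open comments;
       acc   = characters kept so far, in reverse order *)
    fun go (cs, depth, acc) =
      case (cs, depth) of
        ([], 0) => SOME (implode (rev acc))
      | ([], _) => NONE
      | (#"/" :: #"*" :: rest, _) => go (rest, depth + 1, acc)
      | (#"*" :: #"/" :: rest, _) =>
          if depth > 0
          then go (rest, depth - 1, acc)
          (* a closer outside any comment is kept as ordinary text *)
          else go (rest, 0, #"/" :: #"*" :: acc)
      | (c :: rest, 0) => go (rest, 0, c :: acc)
      | (_ :: rest, _) => go (rest, depth, acc)
  in
    go (explode s, 0, [])
  end
```

For example, stripComments "a/*b/*c*/d*/e" evaluates to SOME "ae", and stripComments "a/*b" evaluates to NONE. In an ML-Lex specification the same counter would more likely live in a ref cell in the user declarations, combined with a start state for "inside a comment"; that arrangement is a suggestion, not something the assignment prescribes.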

What to turn in

  • Submit myfun.sml, fun.lex, as well as a README. In the README, please write,
    Upload your submission to dropbox here.

    Caveat: Some of the example code in the ML-Lex documentation is written for an earlier version of SML; for example, the library function "revfold" used in the documentation is now called "foldl".
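
As a rough sketch of what porting such older code looks like, here is a digit-string conversion written against the current Basis Library. The names digitsToInt and digitsReversed are hypothetical and chosen for illustration. List.foldl takes the function, then the initial value, then the list, and it walks the list left to right, which is the order a digit accumulator needs. The List.foldr version is included only for contrast.

```sml
(* Convert a string of decimal digits to an int, reading left to right.
   Assumes s contains only the characters 0-9. *)
fun digitsToInt (s : string) : int =
  List.foldl (fn (c, acc) => 10 * acc + (Char.ord c - Char.ord #"0"))
             0 (explode s)

(* Same folding function with List.foldr, shown only for contrast:
   it consumes the digits right to left, so the digits come out reversed. *)
fun digitsReversed (s : string) : int =
  List.foldr (fn (c, acc) => 10 * acc + (Char.ord c - Char.ord #"0"))
             0 (explode s)
```

Here digitsToInt "123" evaluates to 123, while digitsReversed "123" evaluates to 321. Note also that explode returns a list of characters and Char.ord takes a character, so older examples that treat single-character strings as characters need the same kind of adjustment.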
