Parsing

In this assignment, you will use ML-Yacc to build a parser for the Fun language. Chapter 3 of your textbook (particularly 3.4) explains many of the concepts you'll need to know to complete this assignment. There is also documentation on ML-Yacc here. You will also need to refer to the Fun language reference.
  1. Make a new directory as3 for this assignment, and copy all your as2 files into it. Then unpack as3.zip to get some new files: sources.cm fun.grm parse.sml compile.sml all.fun testcases.sml.This will overwrite your sources.cm with a new one.
  2. For this assignment you will use the fun.lex you built in assignment 2. If yours doesn't work well enough to be usable, ask the Teaching Assistant for a working one - do not pass the TA's lexer on to others!
  3. Edit fun.grm until it specifies a parser for the Fun language. Don't worry at first about filling in the semantic actions; just get the parsing and grammar disambiguation taken care of.
    • The Fun language reference contains grammar rules that describe the syntactic structure of Fun.
    • You will need to specify terminal names that correspond to the names of tokens generated by the lexer. ML-Yacc automatically generates a structure that will build the tokens. In other words, fun.grm will need to contain the following code (extended with the other terminal names):

      %term EOF
      | INT of int
      | ID of string
      | COMMA | SEMICOLON | COLON
      | LPAREN | RPAREN | ...


      The nonterminals you use can have any names that you want. For example,

      %nonterm foo of Absyn.exp | ...

      But I recommend more descriptive names than "foo".
    • The Fun language definition isn't 100% clear about the exact grammar you have to use--that's part of the "Fun" of this assignment. However, the file testcases.sml should be very helpful here: it's a list of pairs, where each pair of strings should parse into the same abstract syntax.
    • If you use precedence directives, do so in a reasoned way. In other words, don't just throw in precedence directives to correct any random error in your grammar; be able to explain in your README how your use of precedence directives achieves the desired result, perhaps with reference to the fun.grm.desc file generated by ml-yacc.
    • Compile using CM.make "sources.cm"; This will automatically run ml-lex and ml-yacc as necessary. Test your parser by invoking Compile.compile "filename". This calls Parse.parse and then invokes the interpreter on the file. For example, use "all.fun" as the filename.
    • Another way to test is to do val x = Compile.go(); For each pair of strings in testcases.sml, it runs the parser on that pair of strings, and compares the abstract syntax (after all the Pos markers have been removed).
    • Your goal is to develop a parser with no reduce/reduce conflicts, and only benign shift-reduce conflicts. If you cannot eliminate certain conflicts, explain in your README why they occur and why they are benign (you can refer to, and quote from, your fun.grm.desc file).
  4. Once you have your grammar disambiguated and debugged, fill in the parser's semantic actions. These should generate abstract syntax in the Fun intermediate language. Here are a list of concerns to keep in mind.
    • Some of the source level operations do not appear in the intermediate language. For instance, the "and", "or", "sequencing" (i.e., semicolon), and "unary minus" need to be compiled into other existing constructs. For example, "e1 and e2" can translate as if it were "if e1 then (if e2 then 1 else 0) else 0".
    • Each object that you generate in the abstract syntax should be wrapped in a Fun.Pos constructor that marks the beginning and ending locations of the construct in the source code.
  5. Add some support for detection and correction of common errors, using ML-Yacc constructs such as "%value" and "%change" described on pages 80-81 of MCIinML and in the ML-Yacc manual. Explain the support you add in your README file.
  6. The file all.fun tests all syntactic constructs in a trivial way but you will certainly want to write other fun files to debug your compiler. Collect three informative test cases in separate files myparsetestX.fun, for X=1,..,3. In a /* comment */ at the beginning of each test file, explain what particularly is being tested; if your test is expected to produce parsing errors, copy the expected error message(s) into the comment. If you want you can send additional test files (with maybe more informative file names) to the cos320 mailing list or post them on piazza, for use by your fellow students. We will evaluate the quality of your parser on your own test files, test files submitted by others, test files submitted to piazza, and our own test files.
  7. In the README, also describe any help you received or gave as usual, point out any design decisions you made, and confirm that the work is your own.
  8. Submit your README, fun.grm, and your myparsetestX.fun test. You may also submit a new version of your lexer (fun.lex), provided you describe the changes you made in the README. Upload you submissions to dropbox here.

Due date:Monday, 7 March 2016, 9:00 PM