COS 320 - Assignment 2

Compiling Techniques, Spring 2012, Princeton University.      Due date: Monday 20 February.
Implement alpha-renaming and constant-folding for the Fun language. First read the Fun language definition and read Section 5.1 of Modern Compiler Implementation.

Part A: Abstract Syntax

Translate (by hand) a short Fun program into abstract syntax constructors.
  1. Read and understand the following files: absyn.sml test.sml. Notice that in test.sml there are five Fun programs translated by hand into abstract syntax constructors.
  2. After your CM.make, execute Test.run(); from the sml interactive prompt. This will pretty-print and evaluate all five programs.
  3. Read the following files, but you're not responsible for understanding them in detail: heap.sig heap.sml eval.sig eval.sml funpp.sig funpp.sml
  4. Read the following files, which are but most likely you will never be responsible for understanding them in detail: table.sig table.sml
  5. Finish myfun.sml by translating the given program into abstract syntax constructors.

Part B. Alpha conversion

Consider the following two programs (one in Fun, the other in C):
fun f(x:int) : int =
 let a = x + 2 in
  let x = a*a in
    x+1
            
int f(int x) {
  int a = x+2;
  {int x = a*a;
   return x+1;
  }
}
In both cases there is an inner variable x that creates a hole in the scope of the outer variable x. In each case, the two different x's are distinct variables, and one could systematically rename either one of the variables to get an equivalent program. Let us rename the inner variable:
fun f(x:int) : int =
 let a = x + 2 in
  let x2 = a*a in
    x2+1
            
int f(int x) {
  int a = x+2;
  {int x2 = a*a;
   return x2+1;
  }
}
Renaming the definition and all uses of a local variable is called "alpha conversion" (defined by the great Princeton mathematician Alonzo Church in the 1930s). Write a program that systematically renames all local variables in a Fun program so that no variable creates a hole in the scope of another. Avoiding holes in scope will make certain compiler optimizations easier to implement.

Examine the file cfold.sml. The module interface ("signature") declares two functions; the first is rename_prog that takes one parameter of type Absyn.prog and returns a result of the same type. This is to be the alpha-converter. Inside the module implementation ("structure") you'll see that rename_prog is implemented to return the original program; you should make it do alpha-conversion.

The suggested method is to take a variable named (for example) foo and, if its definition is in the scope of another variable named foo, rename it to foo'. However, you'll need to add enough primes (apostrophes) to its name to avoid all clashes with variables in scope. Then, keep track of the current renaming of foo, and in the body of the let...in expression that binds foo, rename all the applied occurrences.

The algorithm, therefore, is to do a recursive traversal of each function body, keeping track of two different things:

  1. The set of variables currently in scope
  2. The current renaming of each variable
You can do both of these with an environment consisting of one symbol table. That is, let
type env = Symbol.symbol Symbol.table
Each env maps a symbol to a symbol option, that is, either NONE or SOME(id) for some symbol id.

The invariants are:

  1. If (and only if) an identifier is in scope, then the environment maps it to SOME(_), where the _ stands for any arbitrary value.
  2. The environment maps each identifer to its current renaming.
Notice that these two items correspond exactly to the "two different things" that you need to keep track of.

Now, before you start traversing each individual function definition, you need to record in the environment all the global variables that are in scope. The globals are the library functions and the functions declared in this program. There is exactly one library function named printint, and you can find the names of the functions in this program by traversing the list of fundecs. All these names need to be added to the initial environment, each name mapping to itself (since it can't occur within the scope of an identically named variable, it doesn't need to be renamed).

The formal parameter of each function, and the bound variable of each let, may need to be renamed. Every variable occurring in an expression must be mapped to its current renaming.

Beware! In the expression let id = e1 in e2, the expression e1 is not in the scope of the new definition! Thus, you must traverse e1 using the environment that was in effect before the let was encountered. The expression e2 is in the scope of the new id.

To add a prime to a variable name, you can convert the symbol to a string using Symbol.name, then add an apostrophe using string concatenation, s ^ "'", then convert back to a symbol using Symbol.symbol.

Implementation. Download as2.zip and examine compile.sml. This is a driver that parses a Fun program and hands it to your alpha-converter. It also prints and runs the program, both before and after alpha-conversion. Since alpha-conversion is supposed to produce an equivalent program, you should expect to get the same result by running the program.

You can assume the input program is well formed; for example, it is illegal for a Fun program to have two different top-level function declarations with the same name, but you can assume that won't happen. (It is not illegal for two different functions to have a formal parameter or local variable with the same name.) I have put in the stub of a function bind that you might want to apply whenever you come across the binding occurrence of a variable (a function parameter x, let x = ..., or in the first pass, even a function name x). This takes a current environment and the name x (of the variable being bound here), and returns a new environment and a new variable such that:

  1. The new variable is distinct from any variable in the domain of the current environment, and
  2. The new environment maps x to the new variable.
The implementation of bind should keep adding primes to the name of x until it finds one that is not in use.

Part C: Constant folding

The compiler optimizations of constant propagation, copy propagation, and constant folding can all be performed on the Fun abstract syntax. One can perform all three of these optimizations in one pass over a function body.

Implementation. Replace the stub function cfold_prog in cfold.sml with an optimization function. You will find it handy to have an environment that maps each variable to its current rewriting, that is,

type env' = Absyn.exp Symbol.table

You can assume that the alpha-converter has been run before cfold_prog, so therefore there will never be a variable nested within the scope of another variable with the same name.

Part D: Analysis

Why should we limit copy propagation to variables? Why not do general beta reduction, that is, rewrite let x=E1 in E2 into E2[E1/x] for arbitrary expressions E1?

Write a very short Fun function

slower(x: int) : int = ... 
that performs one multiplication and one addition; but if you were to apply general beta reduction, it would do two multiplications and one addition. Show the function slower' that results from doing the transformation.

Write a very short Fun function

wrong(x: int) : int = ... 
that produces a different answer if you apply general beta reduction to it. (Hint: use ref) Show the function wrong' that results from doing the transformation.

Upload here. Submit a README file along with your cfold.sml file. Into it put:

  1. A brief summary of what help you gave or received on this assignment (to/from other students or the instructors). In general it is NOT appropriate to read other students' programs, or to let other students read your programs; but you may discuss the homeworks with other students.
  2. [optional] Any other comments you have about this homework or your solution to it.
  3. Your functions slower, slower', wrong, wrong'.

Back to COS 320 front page