In order to help you prepare better for the final exam, 
I have pointed out useful exercises in the textbook and 
created a few extras of my own.  These exercises are completely 
optional and will not be graded.  I do not have example solutions.  
If you feel like writing up an example solution for your classmates, 
go right ahead.  Some questions may not be perfectly well-defined.  
That is okay.  Interpret them any way you want that helps you to study.

I recommend practicing with these questions and other similar exercises 
before the exam.  The best way to study is to try them on your own 
before consulting with friends.

Recommended Exercises
---------------------

Lexing 
------ 

Exercises 2.1, 2.2, 2.3, 2.8

Parsing
-------

Exercises 3.1, 3.3-3.17
Exercises 4.1-4.6

Type Checking
-------------

0.  Redo the questions on the midterm & ask questions if you don't understand how.

1.  Using the typing rules for Fun, write down the typing derivation for the fib function:

fun fib (x:int):int = 
  if x <= 1 then 1 else fib(x-1) + fib(x-2)

2.  Prove that the following subtyping rule for tuples is wrong:

-------------------------------------
(t1,...,tk) <= (t1,...,tk,tk+1,...,tn)

Your proof should be a well-typed example program that uses the 
subtyping rule above but crashes (show the typing derivation for the program
and explain the result of evaluation).  Since the well-typed program crashes,
the type system is not sound when it includes the rule above.

3.  Similar to question two, prove that if we add a subtyping rule for functions 
that is covariant in the function argument position, a well-typed program
can crash (and therefore the rule is wrong).  Before giving the counterexample, 
write down the new subtyping rule.

4.  Consider a variation of the Fun language that contains null values.

type t ::= int | (t1,...,tk) | ?(t1,...,tk)
expressions e ::= i | x | (null : t) | (e1,...,ek) 
                | ifnonnull e then x.e1 else e2 | #i e

The idea is that there are two sorts of tuples.  The tuples with type
?(t1,...,tk) might be a pointer to data with k fields and might also be
null.  The tuples with type (t1,...,tk) are definitely not null.  It is an
error to project (use #i e)  from a null pointer.  To determine whether or not
a value is null a programmer uses the expression ifnonnull e then x.e1 else e2.
If e evaluates to a nonnull tuple then the first branch is taken and x is bound
to the nonnull tuple.  Otherwise, the second branch is taken.
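
As one way to make the evaluation behavior described above concrete, here is a
minimal Python sketch of an evaluator for the expression forms.  It is only an
illustration under stated assumptions, not part of the course materials: the
tagged-tuple encoding, the names evaluate and NullProjection, the use of
Python's None for null, and 1-based field indices for #i are all my own
choices.  The type annotation on (null : t) is dropped because it plays no
role at run time.

```python
# Hypothetical encoding of expressions as tagged Python tuples:
#   ("int", i)  ("var", x)  ("null",)  ("tuple", [e1, ..., ek])
#   ("ifnonnull", e, x, e1, e2)  ("proj", i, e)


class NullProjection(Exception):
    """Raised when #i e is applied to a null pointer."""


def evaluate(expr, env=None):
    """Evaluate an expression in an environment mapping variables to values."""
    env = {} if env is None else env
    tag = expr[0]
    if tag == "int":
        return expr[1]
    if tag == "var":
        return env[expr[1]]
    if tag == "null":
        return None  # assumption: null is modelled as Python's None
    if tag == "tuple":
        return tuple(evaluate(e, env) for e in expr[1])
    if tag == "ifnonnull":
        _, scrutinee, name, then_branch, else_branch = expr
        value = evaluate(scrutinee, env)
        if value is not None:
            # First branch: x is bound to the nonnull tuple.
            return evaluate(then_branch, {**env, name: value})
        # Otherwise the second branch is taken, with no new binding.
        return evaluate(else_branch, env)
    if tag == "proj":
        _, index, body = expr
        value = evaluate(body, env)
        if value is None:
            raise NullProjection(f"#{index} applied to null")
        return value[index - 1]  # assumption: fields are numbered from 1
    raise ValueError(f"unknown expression form: {tag!r}")


# Two small programs: one tests a real tuple, the other tests null.
PAIR = ("tuple", [("int", 3), ("int", 4)])
FIRST_OR_ZERO = ("ifnonnull", PAIR, "p", ("proj", 1, ("var", "p")), ("int", 0))
NULL_CASE = ("ifnonnull", ("null",), "p", ("proj", 1, ("var", "p")), ("int", 0))
```

Here evaluate(FIRST_OR_ZERO) takes the first branch and evaluate(NULL_CASE)
takes the second.  The sketch says nothing about typing; whether these
programs type check is a separate question.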

The typing rules are:

------------
G |- i : int

------------ (G(x) = t)
G |- x : t

t = ?(t1,...,tk) 
--------------------
G |- (null : t) : t

G |- e1 : t1   ...   G |- ek : tk
----------------------------------
G |- (e1,...,ek) : (t1,...,tk)

G |- e : ?(t1,...,tk)     G,x:(t1,...,tk) |- e1 : t   G |- e2 : t
------------------------------------------------------------------
G |- ifnonnull e then x.e1 else e2 : t

G |- e : (t1,...,tk) 
---------------------
G |- #i e : ti


A) Write down a well-typed program that uses every language construct.

B) Assume there is an additional typing rule for subsumption:

G |- e : t'    t' <= t
------------------------
G |- e : t 

Write down the best possible subtyping relation for the types in the language.

C) Give an example of a program that does not type check without subtyping but does
type check with subtyping.

D) Write down an ML datatype for the abstract syntax of the expressions and
types in the little language.  

E) Implement a type checker for the little language in ML.  Use any standard
utilities you choose.

Stacks & Activation Records
---------------------------

Exercises 6.3, 6.5, 6.6, 6.7

Garbage Collection
------------------

Exercises 13.3, 13.4

Auxiliary Exercise:

1) Answer the following questions:

a) At run time, when does a value such as a record or an array become garbage?
b) Explain why no one has developed the perfect garbage collector yet.
c) Write some pseudocode that demonstrates how to implement reference
counts for the instruction

M[x + k] = y

where y is a pointer and M[x + k] is the kth field of x, which
contains either null (0) or a pointer.  Assume that the reference count
is held in memory at M[x-4].  Be explicit about any other assumptions 
you must make.

d) Give the "best" reason why people do not use reference counting to implement
garbage collection for languages like ML.

Instruction Selection
---------------------

Exercises 9.1, 9.2, 9.3

Auxiliary Exercise:

1) Intel has come up with a new, super-fast arithmetic chip instruction set.
It is a CISC chip in which the operations have widely differing costs.
The table below lists the important arithmetic instructions and their
costs.  In the table, r_1, r_2, r_3, etc. are registers
whereas c is a constant.

instruction:             cost:
r_1 = r_2 + r_3          1
r_1 = r_2 + r_3 + r_4    3
r_1 = r_2 + c            3
r_1 = r_2 * r_3          5
r_1 = r_2 * r_3 * r_4    8
r_1 = r_2 + r_3 * r_4    5
r_1 = c                  1

Answer the following questions.  Assume you are doing code generation
starting with a tree-like intermediate language with binary (two-argument)
arithmetic operations, as in the textbook.

a) Write down the "tree patterns" that match each of the
instructions listed above.

b) Find the lowest cost sequence of instructions possible
(the "optimum" sequence) to compute the following arithmetic expression,
assuming that all variables are already in registers.

(x * ((3 + y) * 17)) + ((y * x) + 14)

c) Explain how to extend the algorithm for finding the "optimum"
cost sequence of instructions so that it does common sub-expression
elimination efficiently at the same time as instruction
selection and generation.

d) Create an example arithmetic expression with some common subexpressions
and show how your new algorithm works on that expression.
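
To get started on part (b), it can help to encode the expression as a tree
and compute a baseline cost to compare your tiling against.  The Python
sketch below is one possible encoding, not the textbook's: the nested-tuple
representation and the names baseline_cost and EXPR are my own assumptions.
It emits one instruction per tree node, using only the three
single-operation instructions from the table, so its result is an upper
bound on the optimum and not the answer to the exercise.

```python
# Costs copied from the instruction table.  Only the single-operation
# instructions are used in this baseline.
COST_ADD_REG = 1     # r_1 = r_2 + r_3
COST_MUL_REG = 5     # r_1 = r_2 * r_3
COST_LOAD_CONST = 1  # r_1 = c


def baseline_cost(tree):
    """Cost of emitting one instruction per node of an expression tree.

    Assumed encoding: a tree is a variable name (str, already in a
    register), an integer constant, or a triple (op, left, right) with
    op being "+" or "*".
    """
    if isinstance(tree, str):
        return 0  # variables are already in registers
    if isinstance(tree, int):
        return COST_LOAD_CONST  # load the constant into a register first
    op, left, right = tree
    if op == "+":
        op_cost = COST_ADD_REG
    elif op == "*":
        op_cost = COST_MUL_REG
    else:
        raise ValueError(f"unknown operator: {op!r}")
    return op_cost + baseline_cost(left) + baseline_cost(right)


# (x * ((3 + y) * 17)) + ((y * x) + 14)
EXPR = ("+",
        ("*", "x", ("*", ("+", 3, "y"), 17)),
        ("+", ("*", "y", "x"), 14))
```

For the expression from part (b), baseline_cost(EXPR) is 21: three additions,
three multiplications, and three constant loads.  A tiling that uses the
multi-operation instructions may do better, and finding it is the point of
the exercise.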


Program Analysis & Optimization
-------------------------------

Exercises 10.1, 10.5
Exercises 17.1, 17.2, 17.4, 17.5, 17.6
Exercises 18.1, 18.2, 18.5, 18.7, 18.8
Exercises 19.1, 19.7, 19.8, 19.9, 19.11

Auxiliary Exercises:

1) You're the professor for a compiler construction class, and you have
to write a final exam question on code optimization.  Create a small
code example for the exam which reduces to:

return 100

after the following optimizations are applied (only once) in this order:

Common Subexpression Elimination
Hoisting loop-invariant computations
Copy Propagation
Constant Propagation
Constant Folding
Dead Code Elimination

Create the answer key by showing the code before and after each
optimization step.  Each optimization should have some real impact 
(i.e., your initial program should not just be "return 100"!) 
Remember, the smaller the original code example,
the easier it is for you to grade it.  Hint: work backwards.

2) In Java, doing synchronization between concurrent threads is 
expensive.  Therefore, it is important to identify the object
allocation sites that create thread-local objects.  A thread-local
object is one that can only be referenced from a single thread.
Your job is to design a data-flow analysis that can detect the
thread-local allocation instructions.  Assume that the intermediate 
language has the following instructions:

x = alloc(C)  (allocate object with class C)
spawn(x)      (spawn a new thread running x's "runnable" method)
x = c         (assign variable a constant integer)
x = y + z     (add integers)
x = y         (move value in y to x)
jmp L         (jump directly to label L)
branchz r L   (if r is 0 jump to L; otherwise fall through to next instruction)

An allocation site (x = alloc(C)) may be marked "thread-local" if
there does not exist a path through the program that allows the object
allocated by that instruction to appear as the argument to a spawn
instruction (spawn(x)).

Write down pseudo-code for an algorithm that performs the above 
"thread-local" analysis assuming that you have a control-flow graph (CFG)
as an input.  Your pseudo-code should be high-level.  For example,
statements such as:

For each node n in the CFG do {
  if n is (spawn x) then ...
}

are a perfectly good level of abstraction.
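
If you would rather experiment with real code than pseudo-code, the loop
above could be set up in Python roughly as follows.  This is only
scaffolding under my own assumptions: the Node class, its field names, and
the helpers spawned_variables and alloc_sites are hypothetical.  It
deliberately does not perform the analysis itself, since it ignores moves
and control flow.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One CFG node holding a single instruction (hypothetical encoding)."""
    op: str           # "alloc", "spawn", "const", "add", "move", "jmp", "branchz"
    args: tuple = ()  # operands, e.g. ("x", "C") for x = alloc(C)
    succs: list = field(default_factory=list)  # indices of successor nodes


def spawned_variables(cfg):
    """Collect every variable that appears as the argument of some spawn."""
    spawned = set()
    for n in cfg:
        if n.op == "spawn":
            spawned.add(n.args[0])
    return spawned


def alloc_sites(cfg):
    """Return the indices of the allocation instructions in the CFG."""
    return [i for i, n in enumerate(cfg) if n.op == "alloc"]


# A straight-line example program:
#   0: x = alloc(C)
#   1: y = x
#   2: spawn(y)
#   3: z = alloc(D)
EXAMPLE_CFG = [
    Node("alloc", ("x", "C"), [1]),
    Node("move", ("y", "x"), [2]),
    Node("spawn", ("y",), [3]),
    Node("alloc", ("z", "D"), []),
]
```

On EXAMPLE_CFG the helpers find the allocation sites 0 and 3 and the spawned
variable y.  Connecting y back to the allocation at node 0 through the move
at node 1 is the part your data-flow analysis has to supply.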


Register Allocation
-------------------

Exercises 11.1, 11.2, 11.3, 11.4