To help you prepare for the final exam, I have pointed out useful
exercises in the textbook and created a few extras of my own. These
exercises are completely optional and will not be graded. I do not have
example solutions. If you feel like writing up an example solution for
your classmates, go right ahead. Some questions may not be perfectly
well-defined. That is okay. Interpret them any way you want that helps
you to study.

I recommend practicing with these questions and other similar exercises
before the exam. The best way to study is to try them on your own first
before consulting with friends.

Recommended Exercises
---------------------

Lexing
------

Exercises 2.1, 2.2, 2.3, 2.8

Parsing
-------

Exercises 3.1, 3.3-3.17
Exercises 4.1-4.6

Type Checking
-------------

0. Redo the questions on the midterm & ask questions if you don't
   understand how to solve them.

1. Using the typing rules for Fun, write down the typing derivation for
   the fib function:

     fun fib (x:int):int =
       if x <= 1 then 1 else fib(x-1) + fib(x-2)

2. Prove that the following subtyping rule for tuples is wrong:

     ---------------------------------------
     (t1,...,tk) <= (t1,...,tk,tk+1,...,tn)

   Your proof should be a well-typed example program that uses the
   subtyping rule above but crashes (show the typing derivation for the
   program and explain the result of evaluation). Since the well-typed
   program crashes, the type system is not sound when it includes the
   rule above.

3. Similar to question two, prove that if we add a subtyping rule for
   functions that is covariant in the function argument position, a
   well-typed program can crash (and therefore the rule is wrong).
   Before giving the counterexample, write down the new subtyping rule.

4. Consider a variation of the Fun language that contains null values.

     types       t ::= int | (t1,...,tk) | ?(t1,...,tk)

     expressions e ::= i | x | (null : t) | (e1,...,ek)
                     | ifnonnull e then x.e1 else e2 | #i e

   The idea is that there are two sorts of tuples.
   The tuples with type ?(t1,...,tk) might be a pointer to data with k
   fields and might also be null. The tuples with type (t1,...,tk) are
   definitely not null. It is an error to project (use #i e) from a null
   pointer. To determine whether or not a value is null, a programmer
   uses the expression ifnonnull e then x.e1 else e2. If e evaluates to
   a nonnull tuple, then the first branch is taken and x is bound to the
   nonnull tuple. Otherwise, the second branch is taken.

   The typing rules are:

     ------------
     G |- i : int

     ------------ (G(x) = t)
     G |- x : t

     t = ?(t1,...,tk)
     --------------------
     G |- (null : t) : t

     G |- e1 : t1  ...  G |- ek : tk
     ----------------------------------
     G |- (e1,...,ek) : (t1,...,tk)

     G |- e : ?(t1,...,tk)   G,x:(t1,...,tk) |- e1 : t   G |- e2 : t
     ------------------------------------------------------------------
     G |- ifnonnull e then x.e1 else e2 : t

     G |- e : (t1,...,tk)
     ---------------------
     G |- #i e : ti

   A) Write down a well-typed program that uses every language
      construct.

   B) Assume there is an additional typing rule for subsumption:

        G |- e : t'    t' <= t
        ------------------------
        G |- e : t

      Write down the best possible subtyping relation for the types in
      the language.

   C) Give an example of a program that does not type check without
      subtyping but does type check with subtyping.

   D) Write down an ML datatype for the abstract syntax of the
      expressions and types in the little language.

   E) Implement a type checker for the little language in ML. Use any
      standard utilities you choose.

Stacks & Activation Records
---------------------------

Exercises 6.3, 6.5, 6.6, 6.7

Garbage Collection
------------------

Exercises 13.3, 13.4

Auxiliary Exercise:

1) Answer the following questions:

   a) At run time, when does a value such as a record or an array
      become garbage?

   b) Explain why no one has developed the perfect garbage collector
      yet.
   c) Write some pseudocode that demonstrates how to implement
      reference counts for the instruction

        M[x + k] = y

      where y is a pointer and M[x + k] is the kth field of x, which
      contains either null (0) or a pointer. Assume that the reference
      count is held in memory at M[x-4]. Be explicit about any other
      assumptions you must make.

   d) Give the "best" reason why people do not use reference counting
      to implement garbage collection for languages like ML.

Instruction Selection
---------------------

Exercises 9.1, 9.2, 9.3

Auxiliary Exercise:

1) Intel has come up with a new, super-fast arithmetic chip instruction
   set. It is a CISC chip in which the operations have widely differing
   costs. The table below lists the important arithmetic instructions
   and their costs. In the table, r_1, r_2, r_3, etc. are registers,
   whereas c is a constant.

     instruction:              cost:
     r_1 = r_2 + r_3           1
     r_1 = r_2 + r_3 + r_4     3
     r_1 = r_2 + c             3
     r_1 = r_2 * r_3           5
     r_1 = r_2 * r_3 * r_4     8
     r_1 = r_2 + r_3 * r_4     5
     r_1 = c                   1

   Answer the following questions. Assume you are doing code generation
   starting with a tree-like intermediate language with binary
   (two-argument) arithmetic operations, as in the textbook.

   a) Write down the "tree patterns" that match each of the
      instructions listed above.

   b) Find the lowest cost sequence of instructions possible (the
      "optimum" sequence) to compute the following arithmetic
      expression, assuming that all variables are already in registers.

        (x * ((3 + y) * 17)) + ((y * x) + 14)

   c) Explain how to extend the algorithm for finding the "optimum"
      cost sequence of instructions so that it does common
      sub-expression elimination efficiently at the same time as
      instruction selection and generation.

   d) Create an example arithmetic expression with some common
      subexpressions and show how your new algorithm works on that
      expression.
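As a starting point for the auxiliary exercise above, here is a minimal
sketch of a bottom-up dynamic-programming cost computation over the
instruction table. It is only one possible reading of the exercise, not a
reference solution. The tuple encoding of trees and the name best_cost are
hypothetical, and the sketch assumes that the operands of + and * may be
swapped when matching a pattern and that a variable already in a register
costs nothing. Python is used only for brevity. It computes costs, not the
instruction sequence itself.

```python
from functools import lru_cache

# Expression trees are nested tuples (hypothetical encoding):
#   ("var", name) | ("const", value) | ("+", left, right) | ("*", left, right)

@lru_cache(maxsize=None)
def best_cost(tree):
    """Return the lowest total cost of tiling `tree` with the table's instructions."""
    kind = tree[0]
    if kind == "var":
        return 0                      # assumption: variables already sit in registers
    if kind == "const":
        return 1                      # r_1 = c
    if kind not in ("+", "*"):
        raise ValueError(f"unknown node kind: {kind!r}")

    left, right = tree[1], tree[2]
    candidates = []

    if kind == "+":
        # r_1 = r_2 + r_3
        candidates.append(1 + best_cost(left) + best_cost(right))
        # Try each operand order (assumes + is treated as commutative).
        for reg, other in ((left, right), (right, left)):
            if other[0] == "const":
                # r_1 = r_2 + c
                candidates.append(3 + best_cost(reg))
            if other[0] == "+":
                # r_1 = r_2 + r_3 + r_4
                candidates.append(3 + best_cost(reg)
                                  + best_cost(other[1]) + best_cost(other[2]))
            if other[0] == "*":
                # r_1 = r_2 + r_3 * r_4
                candidates.append(5 + best_cost(reg)
                                  + best_cost(other[1]) + best_cost(other[2]))
    else:
        # r_1 = r_2 * r_3
        candidates.append(5 + best_cost(left) + best_cost(right))
        for reg, other in ((left, right), (right, left)):
            if other[0] == "*":
                # r_1 = r_2 * r_3 * r_4
                candidates.append(8 + best_cost(reg)
                                  + best_cost(other[1]) + best_cost(other[2]))

    return min(candidates)
```

For example, for a + b * 2 the sketch compares loading 2, multiplying, and
adding (1 + 5 + 1 = 7) against loading 2 and using the fused add-multiply
(1 + 5 = 6), and reports 6. Extending it to record which pattern won at
each node would recover the instruction sequence.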

Program Analysis & Optimization
-------------------------------

Exercises 10.1, 10.5
Exercises 17.1, 17.2, 17.4, 17.5, 17.6
Exercises 18.1, 18.2, 18.5, 18.7, 18.8
Exercises 19.1, 19.7, 19.8, 19.9, 19.11

Auxiliary Exercises:

1) You're the professor for a compiler construction class, and you have
   to write a final exam question on code optimization. Create a small
   code example for the exam which reduces to:

     return 100

   after the following optimizations are applied (only once) in this
   order:

     Common Subexpression Elimination
     Hoisting loop-invariant computations
     Copy Propagation
     Constant Propagation
     Constant Folding
     Dead Code Elimination

   Create the answer key by showing the code before and after each
   optimization step. Each optimization should have some real impact
   (i.e., your initial program should not just be "return 100"!!)
   Remember, the smaller the original code example, the easier it is
   for you to grade it. Hint: work backwards.

2) In Java, doing synchronization between concurrent threads is
   expensive. Therefore, it is important to identify the object
   allocation sites that create thread-local objects. A thread-local
   object is one that can only be referenced from a single thread. Your
   job is to design a data-flow analysis that can detect the
   thread-local allocation instructions. Assume that the intermediate
   language has the following instructions:

     x = alloc(C)    (allocate object with class C)
     spawn(x)        (spawn a new thread running x's "runnable" method)
     x = c           (assign variable a constant integer)
     x = y + z       (add integers)
     x = y           (move value in y to x)
     jmp L           (jump directly to label L)
     branchz r L     (if r is 0 jump to L; otherwise fall through to
                      the next instruction)

   An allocation site (x = alloc(C)) may be marked "thread-local" if
   there does not exist a path through the program that allows the
   object allocated by that instruction to appear as the argument to a
   spawn instruction (spawn(x)).
   Write down pseudo-code for an algorithm that performs the above
   "thread-local" analysis assuming that you have a control-flow graph
   (CFG) as an input. Your pseudo-code should be high-level. For
   example, statements such as:

     For each node n in the CFG do {
       if n is (spawn x) then ...
     }

   are a perfectly good level of abstraction.

Register Allocation
-------------------

Exercises 11.1, 11.2, 11.3, 11.4