Computer Science 441
Programming Languages
Fall 1998
Lecture 6

Lambda Calculus (continued)
Definability in the lambda calculus (continued)
Last time we discussed the encoding of natural numbers and how to use that to encode addition with the Plus operator. Multiplication is reasonably straightforward now that we have addition. The intuition is that we obtain n*m by applying the (Plus n) function to 0 a total of m times:

Times = λm.λn.m 0 (Plus n)
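As a sanity check, here is a minimal sketch in Python lambdas, assuming the encoding from last lecture in which a numeral takes its zero argument first (n = λz.λs. s applied n times to z); the names zero, succ, plus, times, and to_int are ours, not part of the calculus:

# Church numerals with the lecture's argument order: n = λz.λs. s^n(z).
zero = lambda z: lambda s: z
succ = lambda n: lambda z: lambda s: s(n(z)(s))

# Plus m n applies s m times on top of n's result; Times follows the lecture.
plus  = lambda m: lambda n: lambda z: lambda s: m(n(z)(s))(s)
times = lambda m: lambda n: m(zero)(plus(n))   # Times = λm.λn. m 0 (Plus n)

# Decode a Church numeral into a Python int for testing.
to_int = lambda n: n(0)(lambda x: x + 1)

two, three = succ(succ(zero)), succ(succ(succ(zero)))
print(to_int(times(two)(three)))   # prints 6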
Exercise: Prove by induction on m that Times m n = m*n.
OK, so now we know we can define natural numbers, addition, and multiplication (though it wasn't particularly easy!). What is truly amazing is that with this encoding we can define any computable function from the natural numbers to the natural numbers! That is, any function from the natural numbers to the natural numbers that can be defined in ML, C, C++, Java, Prolog, etc., can be written in the pure lambda calculus. This is pretty impressive given that we don't even have conditional expressions or recursion! However, it turns out that both of these are definable (like the natural numbers) in the lambda calculus. We begin with boolean values and if-then-else:
True = λt.λf.t
False = λt.λf.f
Again, the intuition is like that of the numbers. If someone gave us the real values of true and false, we could apply these functions to them and get the desired answer. With these definitions of True and False, the definition of IfThenElse is trivial:

IfThenElse = λb.λt.λe.b t e
The intuition is that IfThenElse True M N = M and IfThenElse False M N = N, and it is easy to see that both equations hold using the definitions of True and False.
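The booleans transcribe directly into Python lambdas; a quick sketch (the lowercase names are ours):

# Church booleans and conditional, following the lecture's definitions.
true  = lambda t: lambda f: t
false = lambda t: lambda f: f

if_then_else = lambda b: lambda t: lambda e: b(t)(e)

print(if_then_else(true)("M")("N"))    # prints M
print(if_then_else(false)("M")("N"))   # prints N

Note that Python evaluates both branch arguments before selecting one, so this only behaves like a true conditional on values, not on divergent computations.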
We can continue along this path (see Foundational Calculi for Programming Languages by Benjamin Pierce at http://www.cis.upenn.edu/~bcpierce/papers/crchandbook.ps.gz) and also define a pairing function, first and second projections from pairs, etc. One of the more challenging functions to write is actually the predecessor function, but even that can be written by defining a function which maps m to (m-1, m) and then projecting out the first component.
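A sketch of that trick in Python (pair, fst, snd, shift, and pred are our hypothetical names):

# Reusing the numeral encoding from the sketch above.
zero   = lambda z: lambda s: z
succ   = lambda n: lambda z: lambda s: s(n(z)(s))
to_int = lambda n: n(0)(lambda x: x + 1)

# Church pairs: a pair holds two values and hands them to a selector.
pair = lambda a: lambda b: lambda sel: sel(a)(b)
fst  = lambda p: p(lambda a: lambda b: a)
snd  = lambda p: p(lambda a: lambda b: b)

# One step maps (p, q) to (q, q+1); m steps from (0, 0) leave m-1 in front.
shift = lambda p: pair(snd(p))(succ(snd(p)))
pred  = lambda m: fst(m(pair(zero)(zero))(shift))

three = succ(succ(succ(zero)))
print(to_int(pred(three)))   # prints 2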
What about recursive functions? Well, we normally think of Plus and Times as being built recursively out of 0 and Succ, but we seemed to have no difficulty there. In fact, we can define any recursive function in the lambda calculus by understanding recursive definitions in terms of fixed points.
When we write a recursive definition like

Fact = λn. if n=0 then 1 else n*Fact(n-1)

we tend to believe that our definition actually uniquely defines a function (in spite of the fact that we are all aware that it is easy to write recursive functions that never converge to an answer). We would like to show that such a function is actually definable in the lambda calculus.
The basic idea is as follows. Rather than presume the function exists, instead build a higher-order function as follows:

G = λf.λn. if n=0 then 1 else n*f(n-1)

In particular, G Fact evaluates to the right-hand side of the definition of Fact. Thus, given the definition of G, if we could define a function Fact such that G(Fact) = Fact, then we would be done. Because G applied to Fact returns Fact itself, we call Fact a fixed point of G (because it isn't changed by G). The amazing fact about the lambda calculus is that every function has a fixed point!
Let Y = λf.(λx.f(xx))(λx.f(xx)). Then for any function h,

Y h = h(Y h)

Thus (Y h) is always a fixed point of h.

Proof:

Y h = (λf.(λx.f(xx))(λx.f(xx))) h
    =β (λx.h(xx))(λx.h(xx))
    =β h((λx.h(xx))(λx.h(xx)))

But notice that the last line is simply h applied to the term on the previous line, which is itself equal to Y h. Thus Y h = h(Y h), so Y h is a fixed point of h.
In our example above, we can now define Fact = Y G, so we have a definition of factorial in the pure lambda calculus (as long as we use our earlier encodings of numbers, if-then-else, multiplication, etc.). We can do the same with any other recursive function. That is, write it in terms of a higher-order function and then apply Y to that function to get the fixed point.
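A caveat if you try this in a call-by-value language: Python evaluates arguments eagerly, so the Y above loops forever there. A sketch using the eta-expanded variant (often called Z), with G written over ordinary Python ints for readability:

# Z = λf.(λx.f(λv.x(x)(v)))(λx.f(λv.x(x)(v))), a call-by-value Y.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The higher-order G from the lecture.
G = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

fact = Z(G)
print(fact(5))   # prints 120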
Properties of Reduction
A term of the lambda calculus may have many subterms that could be rewritten using the reduction rules. For example,

(λb.b) ((λx.λy.x) 0 1)

may be evaluated first by applying the identity function to the large expression on the right using β-reduction, or the large expression on the right can be reduced once or twice using β-reduction. A subterm to which a reduction rule may be applied is called a redex.
It would be unfortunate if different orders of evaluation in the lambda calculus gave different answers. Luckily this is not the case:

Theorem (Church-Rosser): If M reduces to two different expressions, L1 and L2, then these further reduce to a common expression L.
We say a term N of the lambda calculus is in normal form if no reduction rules apply. A corollary of the Church-Rosser theorem states that if M reduces to a normal form, then the normal form is unique.
Unfortunately, not all terms of the lambda calculus have normal forms. An example is the term Ω = (λx.xx)(λx.xx). Ω is clearly not in normal form, yet each β-reduction results in the same term again. An example which results in larger terms with each reduction is Ω' = (λx.xxx)(λx.xxx).
Even more interestingly, a term may have a normal form but can also be reduced indefinitely. An example is the term ((λx.0) Ω). If we apply the constant function to Ω, we just get 0. However, if we first attempt to reduce the argument, Ω, we just keep reducing forever. Thus a bad reduction strategy can keep you from getting to a normal form, even if one exists.
The three most popular reduction strategies are:

- Normal order (or call-by-name): Always reduce the redex whose λ appears furthest to the left in the term.
- Applicative order (or call-by-value): Always reduce the leftmost term of the form (λx.M)N where N is already in normal form.
- Lazy order: Reduce the leftmost redex as in normal order, but only if it is not in the body of a function.
With our example ((λx.0) Ω), the normal order and lazy order reduction strategies both result in 0, but applicative order never converges.
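We can mimic the difference in Python by representing Ω as a thunk (a rough simulation; the names are ours):

# Forcing omega() loops forever, like reducing Ω.
omega = lambda: (lambda x: x(x))(lambda x: x(x))

const_zero = lambda x: 0

# Applicative order: evaluate the argument first -- never terminates.
# const_zero(omega())        # would recurse forever (RecursionError)

# Normal/lazy order: pass the argument unevaluated; it is never forced.
print(const_zero(omega))     # prints 0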
Theorem: If N is normalizable, then a sequence of normal-order reductions on N always terminates in a normal form after a finite number of reductions.

Similarly, lazy order always results in "weak head normal form" if the term has one.
Applicative order tends to be more efficient than normal order, but it may not terminate when normal order would.
Binding Time

Program elements have attributes which must be "bound" to them at some point.

Binding = fixing a value or some other property of an object from a set of possibilities, i.e., MAKING A DECISION.

Example: Bind a variable to a location and a value.

The time of making such a decision is called the binding time. Possibilities: execution, translation, language implementation, language definition.
Dynamic

Execution:
- Entry to a block or subprogram - bind actual to formal parameter, location of local variable.
- Arbitrary points - values to variables via assignment.

Static

Translation:
- Determined by programmer - declarations bind type to variable name, values to constants.
- Determined by translator - global variable to location (load time), source program to object program representation.
Implementation: Representation of values in the computer, semantics of operations and statements - if not uniform, may lead to different results on different machines.

Language Definition: Structure of the language, possible types, representation of values in program text.
Example: When is "+" bound to its meaning in "x + 10"?
- At language definition, implementation, or translation time?
- In some languages it may be execution time, since the meaning may depend on the run-time type of x (see the sketch below).
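Python is an example of the last case; "+" is not bound to a particular operation until x has a run-time value (a minimal illustration of our own):

# The meaning of "+" here is chosen at run time from the type of x.
def add_ten(x):
    return x + 10

print(add_ten(5))      # 15   (integer addition)
print(add_ten(2.5))    # 12.5 (floating-point addition)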
The difference between reserved words and key words has to do with binding time:
- A key word has an intrinsic meaning (bound at language definition/implementation time).
- A reserved word's binding can't be changed by the programmer.

Example: "DO" is a reserved word in Pascal, but not in FORTRAN (one can write DO = 10). "Integer" may be redefined in Pascal, but not in FORTRAN or Ada.
Why care about binding time? Many language design decisions relate to early vs. late binding:
- Late - more flexible.
- Early - more efficient.

Example: "+" bound at translation time vs. execution time. Early binding supports compilation; late binding leads toward interpretation.

Small changes may delay binding time.
Example: recursion forces a delay in binding local variables to locations (FORTRAN, which disallows recursion, can use static allocation rather than stack-based allocation).

It is generally considered useful to bind as soon as possible.
As we work down the layers in examining or translating a language, we may find we are able to make more bindings, e.g., via constant propagation - this is what supports optimizers.
Bindings are maintained in structures both at compile time and at run time. During compilation, declarations are stored in a symbol table:
- Symbol table: Names -> Attributes

Most of these attributes are used during compilation and need not be saved. Other attributes are needed at execution time. The run-time environment keeps track of the meanings of names:
- Environment: Names -> Locations

The contents of locations also change during execution. This mapping is usually called the memory or state:
- Memory: Locations -> Values

With an interpreter, we can just keep both sets of values together in one Environment. A sketch of these two run-time maps appears below.
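A toy model of the environment and memory maps in Python (the names and values are our own):

# Environment: Names -> Locations; Memory: Locations -> Values.
environment = {"x": 0, "y": 1}
memory      = {0: 42, 1: 17}

def lookup(name):
    # Two-step dereference: name -> location -> value.
    return memory[environment[name]]

def assign(name, value):
    # Assignment changes memory; the environment is untouched.
    memory[environment[name]] = value

assign("x", 7)
print(lookup("x"))   # prints 7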
(Notice that your homework interpreter has no run-time environment since there
are no identifiers being interpreted yet!)