Computer Science 441
Programming Languages
Fall 1998
Lecture 6

Lambda Calculus (continued)
Definability in the lambda calculus (continued)
Last time we discussed the encoding of natural numbers and how to use that to encode addition with the Plus operator. Multiplication is reasonably straightforward now that we have addition. The intuition is that we obtain n*m by applying the (Plus n) function to 0 a total of m times:

Times = λm.λn.m 0 (Plus n)
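As a sanity check, here is a minimal sketch in Python lambdas, assuming the encoding from last lecture in which a numeral takes its zero argument first (n = λz.λs. s applied n times to z); the names zero, succ, plus, times, and to_int are ours, not part of the calculus:

# Church numerals with the lecture's argument order: n = λz.λs. s^n(z).
zero = lambda z: lambda s: z
succ = lambda n: lambda z: lambda s: s(n(z)(s))

# Plus m n applies s m times on top of n's result; Times follows the lecture.
plus  = lambda m: lambda n: lambda z: lambda s: m(n(z)(s))(s)
times = lambda m: lambda n: m(zero)(plus(n))   # Times = λm.λn. m 0 (Plus n)

# Decode a Church numeral into a Python int for testing.
to_int = lambda n: n(0)(lambda x: x + 1)

two, three = succ(succ(zero)), succ(succ(succ(zero)))
print(to_int(times(two)(three)))   # prints 6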
Exercise: Prove by induction on m that Times m n = m*n.
OK, so now we know we can define natural numbers, addition, and multiplication (though it wasn't particularly easy!). What is truly amazing is that with this encoding we can define any computable function from the natural numbers to the natural numbers! That is, any function from the natural numbers to the natural numbers that can be defined in ML, C, C++, Java, Prolog, etc., can be written in the pure lambda calculus. This is pretty impressive given that we don't even have conditional expressions or recursion! However, it turns out that both of these are definable (like the natural numbers) in the lambda calculus. We begin with boolean values and if-then-else:
True = λt.λf.t
False = λt.λf.f
Again, the intuition is like that of the numbers. If someone gave us the real values of true and false, we could apply these functions to them and get the desired answer. With these definitions of True and False, the definition of IfThenElse is trivial:

IfThenElse = λb.λt.λe.b t e
The intuition is that IfThenElse True M N = M and IfThenElse False M N = N, and it is easy to see that both equations hold using the definitions of True and False.
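The booleans transcribe directly into Python lambdas; a quick sketch (the lowercase names are ours):

# Church booleans and conditional, following the lecture's definitions.
true  = lambda t: lambda f: t
false = lambda t: lambda f: f

if_then_else = lambda b: lambda t: lambda e: b(t)(e)

print(if_then_else(true)("M")("N"))    # prints M
print(if_then_else(false)("M")("N"))   # prints N

Note that Python evaluates both branch arguments before selecting one, so this only behaves like a true conditional on values, not on divergent computations.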
We can continue along this path (see Foundational Calculi for Programming Languages by Benjamin Pierce at http://www.cis.upenn.edu/~bcpierce/papers/crchandbook.ps.gz) and also define a pairing function, first and second projections from pairs, etc. One of the more challenging functions to write is actually the predecessor function, but even that can be written by defining a function which maps m to (m-1, m) and then projecting out the first component.
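A sketch of that trick in Python (pair, fst, snd, shift, and pred are our hypothetical names):

# Reusing the numeral encoding from the sketch above.
zero   = lambda z: lambda s: z
succ   = lambda n: lambda z: lambda s: s(n(z)(s))
to_int = lambda n: n(0)(lambda x: x + 1)

# Church pairs: a pair holds two values and hands them to a selector.
pair = lambda a: lambda b: lambda sel: sel(a)(b)
fst  = lambda p: p(lambda a: lambda b: a)
snd  = lambda p: p(lambda a: lambda b: b)

# One step maps (p, q) to (q, q+1); m steps from (0, 0) leave m-1 in front.
shift = lambda p: pair(snd(p))(succ(snd(p)))
pred  = lambda m: fst(m(pair(zero)(zero))(shift))

three = succ(succ(succ(zero)))
print(to_int(pred(three)))   # prints 2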
What about recursive functions? Well, we normally think of Plus and Times as being built recursively out of 0 and Succ, but we seemed to have no difficulty there. In fact, we can define any recursive function in the lambda calculus by understanding recursive definitions in terms of fixed points.
When we write a recursive definition like

Fact = λn. if n=0 then 1 else n*Fact(n-1)

we tend to believe that our definition actually uniquely defines a function (in spite of the fact that we are all aware that it is easy to write recursive functions that never converge to an answer). We would like to show that such a function is actually definable in the lambda calculus.
The basic idea is as follows. Rather than presume the function exists, instead build a higher-order function as follows:

G = λf.λn. if n=0 then 1 else n*f(n-1)

In particular, G Fact evaluates to the right-hand side of the definition of Fact. Thus, given the definition of G, if we could define a function Fact such that G(Fact) = Fact, then we would be done. Because G applied to Fact returns Fact itself, we call Fact a fixed point of G (because it isn't changed by G). The amazing fact about the lambda calculus is that every function has a fixed point!
Let Y = λf.(λx.f(xx))(λx.f(xx)). Then for any function h,

Y h = h(Y h)

Thus (Y h) is always a fixed point of h.

Proof:

Y h = (λf.(λx.f(xx))(λx.f(xx))) h
    =β (λx.h(xx))(λx.h(xx))
    =β h((λx.h(xx))(λx.h(xx)))

But notice that the last line is simply h applied to the term on the previous line, which is itself equal to Y h. Thus Y h = h(Y h), so Y h is a fixed point of h.
In our example above, we can now define Fact = Y G, so we have a definition of factorial in the pure lambda calculus (as long as we use our earlier encodings of numbers, if-then-else, multiplication, etc.). We can do the same with any other recursive function. That is, write it in terms of a higher-order function and then apply Y to that function to get the fixed point.
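A caveat if you try this in a call-by-value language: Python evaluates arguments eagerly, so the Y above loops forever there. A sketch using the eta-expanded variant (often called Z), with G written over ordinary Python ints for readability:

# Z = λf.(λx.f(λv.x(x)(v)))(λx.f(λv.x(x)(v))), a call-by-value Y.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The higher-order G from the lecture.
G = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

fact = Z(G)
print(fact(5))   # prints 120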
Properties of Reduction
A term of the lambda calculus may have many subterms that could be rewritten using the reduction rules. For example,

(λb.b) ((λx.λy.x) 0 1)

may be evaluated first by applying the identity function to the large expression on the right using β-reduction, or the large expression on the right can be reduced once or twice using β-reduction. A subterm to which a reduction rule may be applied is called a redex.
It would be unfortunate if different orders of evaluation in the lambda calculus gave different answers. Luckily this is not the case:

Theorem (Church-Rosser): If M reduces to two different expressions, L1 and L2, then these further reduce to a common expression L.
We say a term N of the lambda calculus is in normal form if no reduction rules apply. A corollary of the Church-Rosser theorem states that if M reduces to a normal form, then the normal form is unique.
Unfortunately, not all terms of the lambda calculus have normal forms. An example is the term Ω = (λx.xx)(λx.xx). Ω is clearly not in normal form, yet each β-reduction results in the same term again. An example which results in larger terms with each reduction is Ω' = (λx.xxx)(λx.xxx).
Even more interestingly, a term may have a normal form but can also be reduced indefinitely. An example is the term ((λx.0) Ω). If we apply the constant function to Ω, we just get 0. However, if we first attempt to reduce the argument, Ω, we just keep reducing forever. Thus a bad reduction strategy can keep you from getting to a normal form, even if one exists.
The three most popular reduction strategies are:

- Normal order (or call-by-name): Always reduce the redex whose λ appears furthest to the left in the term.
- Applicative order (or call-by-value): Always reduce the leftmost term of the form (λx.M)N where N is already in normal form.
- Lazy order: Reduce the leftmost redex as in normal order, but only if it is not in the body of a function.
With our example ((λx.0) Ω), the normal order and lazy order reduction strategies both result in 0, but applicative order never converges.
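We can mimic the difference in Python by representing Ω as a thunk (a rough simulation; the names are ours):

# Forcing omega() loops forever, like reducing Ω.
omega = lambda: (lambda x: x(x))(lambda x: x(x))

const_zero = lambda x: 0

# Applicative order: evaluate the argument first -- never terminates.
# const_zero(omega())        # would recurse forever (RecursionError)

# Normal/lazy order: pass the argument unevaluated; it is never forced.
print(const_zero(omega))     # prints 0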
Theorem: If N is normalizable, then a sequence of normal-order reductions on N always terminates in a normal form after a finite number of reductions.

Similarly, lazy order always results in "weak head normal form" if the term has one.
Applicative order tends to be more efficient than normal order, but it may not terminate when normal order would.
Binding Time

Program elements have attributes which must be "bound" to them at some point.

Binding = fixing a value or some other property of an object from a set of possibilities, i.e., MAKING A DECISION.

Example: Bind a variable to a location and a value.

The time of making such a decision is called the binding time. Possibilities: execution, translation, language implementation, language definition.
Dynamic

Execution:
- Entry to a block or subprogram - bind actual to formal parameter, location of local variable.
- Arbitrary points - values to variables via assignment.

Static

Translation:
- Determined by programmer - declarations bind type to variable name, values to constants.
- Determined by translator - global variable to location (load time), source program to object program representation.
Implementation: Representation of values in the computer, semantics of operations and statements - if not uniform, may lead to different results on different machines.

Language Definition: Structure of the language, possible types, representation of values in program text.
Example: When is "+" bound to its meaning in "x + 10"?
- At language definition, implementation, or translation time?
- In some languages it may be execution time, since the meaning may depend on the run-time type of x (see the sketch below).
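Python is an example of the last case; "+" is not bound to a particular operation until x has a run-time value (a minimal illustration of our own):

# The meaning of "+" here is chosen at run time from the type of x.
def add_ten(x):
    return x + 10

print(add_ten(5))      # 15   (integer addition)
print(add_ten(2.5))    # 12.5 (floating-point addition)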
The difference between reserved words and key words has to do with binding time:
- A key word has an intrinsic meaning (bound at language definition/implementation time).
- A reserved word's binding can't be changed by the programmer.

Example: "DO" is a reserved word in Pascal, but not in FORTRAN (one can write DO = 10). "Integer" may be redefined in Pascal, but not in FORTRAN or Ada.
Why care about binding time? Many language design decisions relate to early vs. late binding:
- Late - more flexible.
- Early - more efficient.

Example: "+" bound at translation time vs. execution time. Early binding supports compilation; late binding leads toward interpretation.

Small changes may delay binding time.
Example: recursion forces a delay in binding local variables to locations (FORTRAN, which disallows recursion, can use static allocation rather than stack-based allocation).

It is generally considered useful to bind as soon as possible.
As we work down the layers in examining or translating a language, we may find we are able to make more bindings, e.g., via constant propagation - this is what supports optimizers.
Bindings are maintained in structures both at compile time and at run time. During compilation, declarations are stored in a symbol table:
- Symbol table: Names -> Attributes

Most of these attributes are used during compilation and need not be saved. Other attributes are needed at execution time. The run-time environment keeps track of the meanings of names:
- Environment: Names -> Locations

The contents of locations also change during execution. This mapping is usually called the memory or state:
- Memory: Locations -> Values

With an interpreter, we can just keep both sets of values together in one Environment. A sketch of these two run-time maps appears below.
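A toy model of the environment and memory maps in Python (the names and values are our own):

# Environment: Names -> Locations; Memory: Locations -> Values.
environment = {"x": 0, "y": 1}
memory      = {0: 42, 1: 17}

def lookup(name):
    # Two-step dereference: name -> location -> value.
    return memory[environment[name]]

def assign(name, value):
    # Assignment changes memory; the environment is untouched.
    memory[environment[name]] = value

assign("x", 7)
print(lookup("x"))   # prints 7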
(Notice that your homework interpreter has no run-time environment since there
are no identifiers being interpreted yet!)