COS441 Lecture 4

Princeton University
Computer Science Dept.

Computer Science 441
Programming Languages
Fall 1998

Lecture 4

More functional languages

Higher-order functions

Functional languages provide a new kind of "glue" that lets the programmer write small modules and "glue" them together into larger programs.

Can build your own glue by writing higher-order functions.

Can write product on lists by writing

    fun prod [] = 1   
      | prod (head::rest) = head * prod rest

Similarly for

    fun sum [] = 0   
      | sum (head::rest) = head + sum rest

Notice general pattern and write higher-order "listify" function:

    fun listify oper identity [] = identity   
      | listify oper identity (fst::rest) =    
                            oper(fst,listify oper identity rest);   
    val listify = fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b

then

    val listsum = listify (op +) 0;
   
    val listmult = listify (op *) 1;
   
    (* explicit argument l keeps length polymorphic under the value restriction *)
    fun length l = let fun add1(x,y) = 1 + y    
                   in listify add1 0 l    
                   end;
   
    fun append a b = let fun cons(x,y) = (x::y)    
                                in listify cons b a    
                                end;

Can define other higher-order functions as glue also.
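As a sketch of further glue, map and filter can themselves be written with listify. The names mapVia and filterVia below are made up for illustration and do not come from the lecture. Note that listify has the same type as the Basis Library function List.foldr.

```sml
(* listify as defined in the notes above *)
fun listify oper identity [] = identity
  | listify oper identity (fst::rest) =
      oper (fst, listify oper identity rest);

(* Hypothetical helpers: apply f to every element / keep elements satisfying p *)
fun mapVia f xs = listify (fn (x, acc) => f x :: acc) [] xs;
fun filterVia p xs =
  listify (fn (x, acc) => if p x then x :: acc else acc) [] xs;

val doubled = mapVia (fn x => 2 * x) [1, 2, 3];              (* [2, 4, 6] *)
val evens   = filterVia (fn x => x mod 2 = 0) [1, 2, 3, 4];  (* [2, 4] *)

(* listify agrees with the library's right fold on this input *)
val agrees = (listify (op +) 0 [1, 2, 3] = List.foldr (op +) 0 [1, 2, 3]);
```

Both helpers take the list as an explicit argument, which keeps them polymorphic.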

Can also string together programs as pipeline generating, filtering and transforming data.

(Works best with lazy evaluation)

Look back at the square-root function written with lazy lists.

    fun sqrtapprox x eps = within eps (approxsqrts x)

Think of program as composition of boxes, glued together by pipes (like UNIX pipes).

Lazy evaluation gives proper behavior so don't stack up lots of data between boxes.

Last box requests data from earlier boxes, etc.

In general try to write general boxes which generate, transform and filter data.
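The definitions of approxsqrts and within are not reproduced in these notes, so the versions below are assumed reconstructions. Since SML is eager, they simulate a lazy list with explicit thunks (without caching). The stream type and its constructor Cons are likewise illustrative names. The generator box produces Newton iterates on demand, and the filter box pulls only as many as it needs.

```sml
(* A lazy list: the tail is a thunk, evaluated only when requested *)
datatype 'a stream = Cons of 'a * (unit -> 'a stream);

(* Generator box: successive Newton approximations to sqrt x *)
fun approxsqrts x =
  let
    fun from guess = Cons (guess, fn () => from ((guess + x / guess) / 2.0))
  in
    from 1.0
  end;

(* Filter box: first approximation within eps of its predecessor *)
fun within eps (Cons (a, rest)) =
  let
    val next as Cons (b, _) = rest ()
  in
    if Real.abs (a - b) < eps then b else within eps next
  end;

(* The pipeline from the notes *)
fun sqrtapprox x eps = within eps (approxsqrts x);

val root2 = sqrtapprox 2.0 0.000000001;
```

Although approxsqrts describes an infinite sequence, only the few elements that within requests are ever computed.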

Program Correctness

Referential transparency is key to ease of program verification, because we can replace identifiers by their values.

I.e., if we have

    let val I = E in E' end;

then we get the same value by evaluating E'[E/I], i.e., by replacing all occurrences of I in E' by E and then evaluating.

Thus we can reason that:

    let val x = 2 in x + x end   
    = 2 + 2   
    = 4

If side effects are allowed then this reasoning fails:

Suppose print(n) has value n and induces a side-effect of printing n on the screen. Then

    let val x = print(2) in x + x end   
             != print(2) + print(2)
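A minimal sketch of why the substitution fails, using a reference cell (a feature described later in these notes) to count calls. The function noisy is a hypothetical stand-in for print(n): it returns n and records that it ran.

```sml
val calls = ref 0;

(* Stand-in for print(n): returns n and records that it was called *)
fun noisy n = (calls := !calls + 1; n);

(* let form: the side effect happens once *)
val viaLet = let val x = noisy 2 in x + x end;
val callsAfterLet = !calls;                      (* 1 *)

(* substituted form: the side effect happens twice *)
val viaSubst = noisy 2 + noisy 2;
val callsAfterSubst = !calls - callsAfterLet;    (* 2 *)
```

Both expressions have the value 4, but they differ in observable behavior, so they are not interchangeable.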

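Interestingly, without further conditions our proof rule is only sound for lazy evaluation:

    let val x = m div n in 3 end;

Under eager evaluation this equals 3 only if n != 0! (Otherwise m div n fails before 3 is returned.)

Under lazy evaluation it always equals 3, since x is never needed.

Therefore can use the proof rule only if we can guarantee that there are no side effects in the computation and that all parameters and expressions converge (or if we use lazy evaluation).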

General theorem: Let E be a functional expression (with no side effects). If E converges to a value under eager evaluation, then E converges to the same value under lazy evaluation (but not vice versa!)
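The difference can be observed in SML, which is eager, by delaying the division with a function. This is only a simulation of lazy evaluation, and the names eager and delayed are illustrative.

```sml
(* Eager: m div n is evaluated even though x is never used *)
fun eager m n = let val x = m div n in 3 end;

(* Simulated lazy: x is a thunk that is never called *)
fun delayed m n = let val x = fn () => m div n in 3 end;

val viaDelay = delayed 1 0;                   (* 3 *)
val viaEager = eager 1 0 handle Div => ~1;    (* ~1: the Div exception was raised *)
```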

Let's see how you can give a proof of correctness of a functional program:

    fun fastfib n : int = 
          let    
            fun fibLoop a b 0 = a   
              | fibLoop a b n : int = fibLoop  b (a+b) (n-1)   
          in 
            fibLoop 1 1 n   
          end;

Prove fastfib n = fib n where

    fun fib 0 = 1   
     |  fib 1 = 1   
     |  fib n = fib (n-2) + fib (n-1);

Let a_i = fib i, for all i >= 0.

Therefore a_0 = a_1 = 1, and a_i + a_{i+1} = a_{i+2} for all i >= 0, by the definition of fib.

Theorem: For all i >= 0 and all n >= 0, fibLoop a_i a_{i+1} n = a_{i+n}.

Proof by induction on n:

If n = 0, fibLoop a_i a_{i+1} 0 = a_i = a_{i+0} by the definition of fibLoop.

Suppose n > 0 and the claim is true for n - 1 (for all i):

Then

    fibLoop a_i a_{i+1} n = fibLoop a_{i+1} (a_i + a_{i+1}) (n - 1)
                          = fibLoop a_{i+1} a_{i+2} (n - 1)
                          = a_{(i+1)+(n-1)} = a_{i+n}.

Now

    fastfib n = fibLoop 1 1 n
              = fibLoop a_0 a_1 n
              = a_{0+n}
              = a_n

by the Theorem.

Therefore, for all n, fastfib n = fib n.
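A proof covers all n, but a quick mechanical check on small inputs is a useful sanity test of the definitions. The sketch below repeats fib and fastfib from above and compares them for n = 0 to 19. The bound 20 is an arbitrary choice.

```sml
fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-2) + fib (n-1);

fun fastfib n : int =
  let
    fun fibLoop a b 0 = a
      | fibLoop a b n : int = fibLoop b (a+b) (n-1)
  in
    fibLoop 1 1 n
  end;

(* true if the two functions agree on 0, 1, ..., 19 *)
val agree = List.all (fn n => fastfib n = fib n) (List.tabulate (20, fn n => n));
```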

Similar proofs can be given for other facts, e.g.,

    nlength (append l1 l2) = nlength(l1) + nlength(l2)

where

    fun nlength [] = 0   
      | nlength (h::rest) = 1 + nlength rest

and

    fun append [] l2 = l2   
      | append (h::rest) l2 = h :: (append rest l2)
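A sketch of that proof, by induction on l1: if l1 = [], then nlength (append [] l2) = nlength l2 = 0 + nlength l2. If l1 = h::rest, then nlength (append (h::rest) l2) = nlength (h :: append rest l2) = 1 + nlength (append rest l2), which by the induction hypothesis is 1 + nlength rest + nlength l2 = nlength (h::rest) + nlength l2. The code below only spot-checks the property on a few sample lists chosen for illustration.

```sml
fun nlength [] = 0
  | nlength (h::rest) = 1 + nlength rest;

fun append [] l2 = l2
  | append (h::rest) l2 = h :: (append rest l2);

(* The property to check, stated as a function of the two lists *)
fun lengthsAdd (l1, l2) =
  nlength (append l1 l2) = nlength l1 + nlength l2;

val samples = [([], []), ([1], []), ([], [2, 3]), ([1, 2, 3], [4, 5])];
val holdsOnSamples = List.all lengthsAdd samples;
```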

Imperative features - references

ref is a built-in constructor that creates references (i.e., addresses).

Example

    - val p = ref 17;
    val p = ref 17 : int ref

Can get at value of reference by writing !p

    - !p + 3;   
    val it = 20 : int

Also have assignment operator ":="

    - p := !p + 1;   
    val it = () : unit   
    - !p;   
    val it = 18 : int

Other imperative commands:

(E1; E2; ...; En) - evaluate all expressions (for their side-effects), returning value of En

while E1 do E2 - evaluates E2 repeatedly until E1 is false (result of while always has type unit)

Writing Pascal programs in ML:

    fun decrement(counter : int ref) = counter := !counter - 1;
   
    fun fact(n) = let 
                     val counter = ref n; 
                     val total = ref 1;   
                  in 
                     while !counter > 1 do   
                        (total := !total * !counter ;   
                         decrement counter);   
                     !total   
                  end;

There are restrictions on the types of references: e.g., can't have references to polymorphic objects (e.g., nil or polymorphic functions). See discussion of non-expansive expressions in section 5.3.1. Essentially, only non-expansive expressions (syntactic values such as constants, function definitions, and tuples of them) can be given polymorphic type. Results of function applications, and the contents of references, can never be polymorphic.
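A small illustration of the restriction, with names chosen only for this example. If ref [] were allowed the type 'a list ref, a program could store an int list in it and later read it back as a string list. Giving the reference a single fixed type rules this out.

```sml
(* A function definition is non-expansive, so it may be polymorphic *)
val identity = fn x => x;                      (* 'a -> 'a *)
val bothUses = (identity 1, identity "one");   (* used at int and at string *)

(* A reference is used at one type only; the annotation fixes it to int list *)
val cell : int list ref = ref [];
val () = cell := [1, 2, 3];
val total = List.foldl (op +) 0 (!cell);       (* 6 *)
```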

Implementation issues

Efficiency:

Functional languages have tended not to run as fast as imperative: Why?

Use lists instead of arrays - linear time rather than constant to access elements

Passing around functions can be expensive: local variables must be retained for later execution, so they must be allocated from the heap rather than the stack.

Recursion typically uses a lot more space than iterative algorithms.
New compilers detect "tail recursion" and transform it to iteration.
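As an illustration, here are two versions of factorial, assuming n >= 0. In the first, the multiplication is still pending when the recursive call is made, so each call needs its own stack frame. In the second, the recursive call is the last action, so a compiler can reuse the frame and run it as a loop. The names factIter and loop are illustrative.

```sml
(* Not tail recursive: n * ... is computed after the recursive call returns *)
fun fact 0 = 1
  | fact n = n * fact (n - 1);

(* Tail recursive: the running product is carried in the accumulator acc *)
fun factIter n =
  let
    fun loop (0, acc) = acc
      | loop (k, acc) = loop (k - 1, k * acc)
  in
    loop (n, 1)
  end;

val same = (fact 10 = factIter 10);
```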

Lack of destructive updating. If a structure is changed, may have to make an entirely new copy (though this is minimized through sharing).
This generates a lot of garbage, so garbage collection must go on in the background.

"Listful style" - easy to write inefficient programs that pass lists around when single element would be sufficient (though optimization may reduce).

If lazy evaluation is used, need to check whether a parameter has already been evaluated, which can be quite expensive to support.
Need an efficient method to do call by need (call by name with the result saved): carry around instructions on how to evaluate the parameter, and don't evaluate until necessary.
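One way to picture the bookkeeping is a suspension: a reference cell holding either the instructions for computing a value or the value itself. This is a minimal sketch in SML, not a description of how any particular lazy implementation works. The names state, susp, delay and force are chosen for illustration.

```sml
(* A suspension is either still delayed or already forced *)
datatype 'a state = Delayed of unit -> 'a | Forced of 'a;
type 'a susp = 'a state ref;

(* Package up the instructions without running them *)
fun delay (f : unit -> 'a) : 'a susp = ref (Delayed f);

(* Run the instructions at most once; afterwards return the saved value *)
fun force (s : 'a susp) : 'a =
  case !s of
      Forced v => v
    | Delayed f => let val v = f () in (s := Forced v; v) end;

(* Count how often the delayed computation actually runs *)
val evaluations = ref 0;
val s = delay (fn () => (evaluations := !evaluations + 1; 6 * 7));
val first = force s;     (* 42, computed now *)
val second = force s;    (* 42, taken from the saved value *)
```

Every use of a lazy parameter pays for the check in force, which is the overhead referred to above.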

Programs run with the current implementation of Standard ML of New Jersey are estimated to run only 2 to 5 times slower than equivalent C programs. (Uses continuations.)

Lazy would be slower.

What would happen if we designed an alternative architecture based on functional programming languages?

Concurrency

One of driving forces behind development of functional languages.

Because values cannot be updated, result not dependent on order of evaluation.

Therefore don't need explicit synchronization constructs.

In a distributed environment, can make copies with no danger of the copies becoming inconsistent.

When evaluating f(g(x),h(x)), can evaluate g(x) and h(x) simultaneously (with eager evaluation).

Two sorts of parallel architectures: data-driven and demand-driven.

Demand-driven architectures (like reduction machines) support lazy evaluation.
Data-driven architectures (like dataflow architectures) support eager evaluation.

Elements of these are being integrated into parallel computer designs.

Idea is programmer need not put parallel constructs into program and same program will run on single processor and multi-processor architectures.

Not quite there yet. Current efforts require hints from programmer to allocate parts of computation to different processors.

Summary

Functional programming requires alternative way of looking at algorithms.

Referential transparency supports reasoning about programs and execution on highly parallel architectures.

While we lose assignment and control/sequencing commands, we gain the power to write our own higher-order control structures (like listify, while, etc.)

Some cost in efficiency, but gains in programmer productivity since fewer details to worry about (higher-level language) and easier to reason about.

Languages like ML, Miranda, Haskell, Hope, etc. support implicit polymorphism resulting in greater reuse of code.

ML features not discussed:

Support for ADT's and separately compiled modules.
Support for exception handling
Automatic storage management via garbage collection

ML currently being used to produce large systems. Language of choice in programming language research and implementation at CMU, Princeton, Williams, etc.

Computational biology: Human genome project at U. Pennsylvania

Lots of research into extensions. ML 2000 report.

Addition of object-oriented features?

Major elements of programming languages: Syntax, Semantics, Pragmatics

Syntax:

Formal Grammars:

Generative description of language.

Language is set of strings. (E.g. all legal ALGOL 60 programs)

Example

    <expression> ::=  <term> | <expression> <addop> <term>   
    <term>       ::=  <factor> | <term> <multop> <factor>   
    <factor>     ::=  <identifier> | <literal> | (<expression>)   
    <identifier> ::=  a | b | c | d   
    <literal>    ::=  <digit> | <digit> <literal>   
    <digit>      ::=  0 | 1 | 2 | ... | 9   
    <addop>      ::=  + | - | or   
    <multop>     ::=  * | / | div | mod | and

Generates: a + b * c + b

(Figure: parse tree for a + b * c + b)

The grammar determines precedence and the direction in which operators associate: here <multop> binds tighter than <addop>, and both associate to the left.
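The structure the grammar assigns to a + b * c + b can be written down as a tree value. The datatype exp and the function show are hypothetical names for this sketch, and the tree is an abbreviated (abstract) form of the parse tree that drops the chains of nonterminals.

```sml
(* Abbreviated parse trees for the expression grammar *)
datatype exp = Id of string
             | Lit of int
             | Bin of exp * string * exp;

(* a + b * c + b: the * groups first, and the two + group to the left *)
val tree = Bin (Bin (Id "a", "+", Bin (Id "b", "*", Id "c")), "+", Id "b");

(* Print a tree fully parenthesized so the grouping is visible *)
fun show (Id s) = s
  | show (Lit n) = Int.toString n
  | show (Bin (l, oper, r)) = "(" ^ show l ^ " " ^ oper ^ " " ^ show r ^ ")";

val grouped = show tree;    (* "((a + (b * c)) + b)" *)
```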

Extended BNF handy:

item enclosed in square brackets is optional

    <conditional> ::= if <expression> then <statement> [ else <statement> ]

item enclosed in curly brackets means zero or more occurrences

    <literal>::= <digit> { <digit> }

Syntax diagrams - alternative to BNF.
Syntax diagrams typically express repetition with "loops" instead of recursive rules.

Problems with Ambiguity

Suppose we are given the grammar:

    <statement>     ::= <unconditional> | <conditional>
    <unconditional> ::= <assignment> | <for loop> |
                        begin {<statement>} end
    <conditional>   ::= if <expression> then <statement> |
                        if <expression> then <statement> else <statement>

How do you parse: if exp1 then if exp2 then stat1 else stat2 ?

Could be

if exp1 then (if exp2 then stat1 else stat2) or
if exp1 then (if exp2 then stat1) else stat2

I.e. What happens if exp1 is true and exp2 is false?

Ambiguous

Pascal rule: an else is attached to the nearest unmatched then

To get second form, write:

	if exp1 then 
		begin 
			if exp2 then stat1 
		end 
	else 
		stat2

C and Java have the same ambiguity and resolve it the same way, using "{ }" where Pascal uses begin and end.

MODULA-2 and ALGOL 68 require "end" to terminate conditional:

    if exp1 then if exp2 then stat1 else stat2 end end
    if exp1 then if exp2 then stat1 end else stat2 end

(Algol 68 actually uses fi instead of end)

Why isn't it a problem in ML?
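One way to see the answer: an ML conditional is an expression that must produce a value, so every if has both a then branch and an else branch, and each else pairs with exactly one if. The function classify below is a made-up example in which strings stand in for the statements.

```sml
(* Every if has its own else, so there is only one way to read this *)
fun classify (exp1, exp2) =
  if exp1 then (if exp2 then "stat1" else "stat2") else "neither";

val inner = classify (true, false);    (* "stat2" *)
val outer = classify (false, true);    (* "neither" *)
```

The parentheses are there for readability only. Omitting the final else branch would be a syntax error rather than an ambiguity.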

Whether a context-free grammar is ambiguous is undecidable in general

Chomsky developed a mathematical theory of formal languages (the Chomsky hierarchy):

type 0: recursively enumerable
type 1: context-sensitive
type 2: context-free
type 3: regular

BNF (or syntax diagrams) = context-free. Context-free languages can be recognized by push-down automata.

Not all aspects of programming language syntax are context-free.

Examples include declaration before use and the goto statement (whose target label must exist).

Formal description of syntax allows:

programmer to generate syntactically correct programs
parser to recognize syntactically correct programs

Parser-generators (also lexical analysis)- LEX, YACC (available in C and many other languages, e.g., ML), Cornell Program Synthesizer Generator
