Princeton University
Computer Science Dept.

Computer Science 441
Programming Languages
Fall 1998

Lecture 23


Contents:

  1. Semantics of programming languages
    1. Operational Semantics
    2. Axiomatic Semantics
    3. Denotational Semantics

Semantics of programming languages

A brief survey of semantics specification methods:
  1. Operational
  2. Axiomatic
  3. Denotational

Operational Semantics

May have originated with the idea that the definition of a language be an actual implementation, e.g. FORTRAN on the IBM 704.

Can be too dependent on features of actual hardware. Hard to tell if other implementations define same language.

Now define abstract machine, and give translation of language onto abstract machine. Need only interpret that abstract machine on actual machine to get implementation.

Ex: Interpreters for PCF. Transformed a program into a "normal form" program (one that can't be further reduced). More complex for a language with state.

Expressions reduce to a pair (v, s); commands reduce to a new state s.

E.g.

    (e1, rho, s) => (m, s')    (e2, rho, s') => (n, s'')
    ----------------------------------------------------
               (e1 + e2, rho, s) => (m+n, s'')

             (M, rho, s) => (v, s')
    ----------------------------------------
    (X := M, rho, s) => (rho, s'[v/rho(X)])


    (fun(X).M, rho, s) => (< fun(X).M, rho >, s)

    (f, rho, s) => (<fun(X).M, rho'>, s')   (N, rho, s') => (v, s'')
                (M, rho'[v/X], s'') => (v', s''')
    ------------------------------------------------------------
                    (f(N), rho, s) => (v', s''')
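
As one concrete reading of these rules, here is a small OCaml sketch of a big-step evaluator for this fragment (numbers, addition, variables, assignment, functions, application). The reading in which rho maps identifiers to locations and the store s maps locations to values is an assumption of the sketch, as are all type and function names and the treatment of errors; it is not the PCF interpreter used in class.

    type ide = string
    type loc = int

    type exp =
      | Num of int
      | Var of ide
      | Add of exp * exp
      | Assign of ide * exp
      | Fun of ide * exp
      | App of exp * exp

    type value =
      | IntV of int
      | Closure of ide * exp * env        (* <fun(X).M, rho> *)
    and env = (ide * loc) list            (* rho *)
    and store = (loc * value) list        (* s *)

    (* a location not already used in the store *)
    let fresh s = 1 + List.fold_left (fun m (l, _) -> max m l) (-1) s

    (* (e, rho, s) => (v, s') *)
    let rec eval e (rho : env) (s : store) : value * store =
      match e with
      | Num n -> (IntV n, s)
      | Var x -> (List.assoc (List.assoc x rho) s, s)
      | Add (e1, e2) ->
          let (v1, s') = eval e1 rho s in
          let (v2, s'') = eval e2 rho s' in
          (match v1, v2 with
           | IntV m, IntV n -> (IntV (m + n), s'')
           | _ -> failwith "adding non-numbers")
      | Assign (x, m) ->
          (* the rule yields (rho, s'[v/rho(X)]); the sketch also returns v *)
          let (v, s') = eval m rho s in
          let l = List.assoc x rho in
          (v, (l, v) :: List.remove_assoc l s')
      | Fun (x, m) -> (Closure (x, m, rho), s)
      | App (f, n) ->
          let (fv, s') = eval f rho s in
          let (v, s'') = eval n rho s' in
          (match fv with
           | Closure (x, m, rho') ->
               (* the rule writes rho'[v/X]; here X is bound to a fresh
                  location holding v *)
               let l = fresh s'' in
               eval m ((x, l) :: rho') ((l, v) :: s'')
           | _ -> failwith "applying a non-function")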

Meaning of program is sequence of states that machine goes through in executing it - trace of execution. Essentially an interpreter for language.

Very useful for compiler writers since it is a very low-level description.

The idea is that the abstract machine is simple enough that it is impossible to misunderstand its operation.

Note that the text uses a different kind of operational semantics, sometimes called structural operational semantics or "small-step" semantics. They express a bit more clearly where in an expression the next reduction should take place. They work by iteratively making reductions in selected parts of a term.

The "natural" semantics we have been using are often called "big-step" semantics because it involves recursive calls that reduce an entire subterm and may involve many reductions. It works recursively rather than iteratively.

However, it is usually relatively straightforward to show the two kinds of semantics are the same for the core elements of a language.
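
For contrast, here is a tiny OCaml sketch of the two styles on arithmetic expressions only (no state, no environments); the type and function names are illustrative, not taken from the text or the PCF interpreters.

    type exp = Num of int | Add of exp * exp

    (* big-step ("natural") style: one recursive call evaluates a whole subterm *)
    let rec eval = function
      | Num n -> n
      | Add (e1, e2) -> eval e1 + eval e2

    (* small-step style: perform a single reduction at the leftmost redex *)
    let rec step = function
      | Add (Num m, Num n) -> Num (m + n)
      | Add (Num m, e2) -> Add (Num m, step e2)
      | Add (e1, e2) -> Add (step e1, e2)
      | Num n -> Num n                      (* already in normal form *)

    (* iterate small steps until a normal form is reached *)
    let rec normalize e = match e with Num _ -> e | _ -> normalize (step e)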

Axiomatic Semantics

No model of execution.

Definition tells what may be proved about programs. Associate axiom with each construct of language. Rules for composing pieces into more complex programs.

Meaning of construct is given in terms of assertions about computation state before and after execution.

General form:

			{P} statement {Q}
where P and Q are assertions.

Meaning is that if P is true before execution of statement and statement terminates, then Q must be true after termination.

Assignment axiom:

	{P [expression / id]} id := expression  {P}

e.g.

		{a+17 > 0} x := a+17 {x > 0}
or
		{x > 1} x := x - 1 {x > 0} 

While rule:

	If {P & B} stats {P}, then {P} while B do stats {P & not B}

That is, if P is an invariant of stats, then after execution of the loop, P will still be true but B will have failed.

Composition:

	If {P} S1 {Q}, {R} S2 {T}, and Q => R, 
		then {P} S1; S2 {T}

Conditional:

	If {P & B} S1 {Q}, {P & not B} S2 {Q}, 
			then {P} if B then S1 else S2 {Q}

Consequence:

	If P => Q, R => T, and {Q} S {R},
			then {P} S {T}

A program is proved correct if we show

	{Precondition} Prog {PostCondition}

Often easiest to work backwards from Postcondition to Precondition.

Ex:

	{Precondition: exponent0 >= 0}
	base <- base0
	exponent <- exponent0
	ans <- 1
	while exponent > 0 do
		{assert:  ans * base^exponent = base0^exponent0}
		{           & exponent >= 0}
		if odd(exponent) then
				ans<- ans*base
				exponent <- exponent - 1
			else
				base <- base * base
				exponent <- exponent div 2
		end if
	end while
	{Postcondition: exponent = 0}
	{               & ans = base0^exponent0}
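
As a sanity check, here is a runnable OCaml sketch of the program above, with the loop invariant checked as a run-time assertion. The helper int_pow and the name fast_exp are illustrative and not part of the lecture; the sketch assumes the precondition exponent0 >= 0.

    (* b^e for e >= 0, used only to state the invariant *)
    let rec int_pow b e = if e = 0 then 1 else b * int_pow b (e - 1)

    let fast_exp base0 exponent0 =
      let base = ref base0 and exponent = ref exponent0 and ans = ref 1 in
      while !exponent > 0 do
        (* assert: ans * base^exponent = base0^exponent0  &  exponent >= 0 *)
        assert (!ans * int_pow !base !exponent = int_pow base0 exponent0
                && !exponent >= 0);
        if !exponent mod 2 = 1 then begin
          ans := !ans * !base;
          exponent := !exponent - 1
        end else begin
          base := !base * !base;
          exponent := !exponent / 2
        end
      done;
      (* postcondition: exponent = 0 & ans = base0^exponent0 *)
      !ans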

Let us show that:

	P  =  ans * base^exponent = base0^exponent0  &  exponent >= 0
is an invariant assertion of the while loop.

The proof rule for a while loop is:

	If {P & B} S {P}  then  {P} While B do S {P & not-B}
We need to show P above is invariant (i.e., verify that {P & B} S {P}).

Thus we must show:

{P & exponent > 0}
if odd(exponent) then
                ans<- ans*base
                exponent <- exponent - 1
            else
                base <- base * base
                exponent <- exponent div 2
        end if
{P}
However, the if..then..else.. rule is:
    if {P & B} S1 {Q} and {P & not-B} S2 {Q} then 
                                    {P} if B then S1 else S2 {Q}.
Thus it will be sufficient if we can show
(1) {P & exponent > 0 & odd(exponent)}
            ans<- ans*base; exponent <- exponent - 1 {P} 
and
(2) {P & exponent > 0 & not-odd(exponent)}
            base <- base * base; exponent <- exponent div 2 {P}
But these are now relatively straightforward to show. We do (1) in detail and leave (2) as an exercise.

Recall the assignment axiom is {P[exp/X]} X := exp {P}.

If we push P "back" through the two assignment statements in (1), we get:

{P[ans*base/ans][exponent - 1/exponent]} 
                ans<- ans*base; exponent <- exponent - 1 {P}
But if we make these substitutions in P, we get that the precondition is:
    ans*base*(base^(exponent - 1)) = base0^exponent0 
            & exponent - 1 >= 0
which can be rewritten using rules of exponents as:
    ans*(base^exponent) = base0^exponent0 & exponent >= 1
Thus, by the assignment axiom (applied twice) we get
(3) {ans*(base^exponent) = base0^exponent0 & exponent >= 1}
            ans<- ans*base; exponent <- exponent - 1 {P}

Because we have the rule:

    If {R} S {Q} and R' => R  then {R'} S {Q}
To prove (1), all we have to do is show that
(4)     P & exponent > 0 & odd(exponent) => 
                    ans*(base^exponent) = base0^exponent0 
                    & exponent >= 1
where P is
    ans*(base^exponent) = base0^exponent0 & exponent >= 0.
Since ans*(base^exponent) = base0^exponent0 appears in both the hypothesis and the conclusion, there is no problem with that part. The only difficulty is to prove that exponent >= 1.

However exponent > 0 & odd(exponent) => exponent >= 1.

Thus (4) is true and hence (1) is true.

A similar proof shows that (2) is true, and hence that P truly is an invariant of the while loop!

Axiomatic semantics is due to Floyd and Hoare; Dijkstra was also a major contributor. It was used to define the semantics of Pascal [Hoare & Wirth, 1973].

The text uses weakest pre-conditions, invented by Dijkstra. The idea is that P = wp(S,Q) iff {P} S {Q} and, if {P'} S {Q}, then P' => P. Thus P is the "weakest" precondition which will guarantee that Q is true after running S. It is the weakest in that any other pre-condition is stronger than it, in the sense that any other pre-condition will imply it.

The advantage of weakest pre-conditions is that they are easier to implement mechanically. If P is the pre-condition of program S, and Q is the post-condition, then {P}S{Q} iff P => wp(S,Q). Thus the algorithm to prove the correctness of the program is to calculate wp(S,Q) and then use a regular theorem prover to prove the implication (ignoring the fact that it is about a program).

Unfortunately, wp(S,Q) cannot be computed entirely mechanically, at least not if you want a finite answer. As a result, human intervention is often required to find loop invariants as above. See the text for details.
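
A small worked example (not from the lecture), using wp(X := E, Q) = Q[E/X] for assignment and wp(S1; S2, Q) = wp(S1, wp(S2, Q)) for sequencing:

    wp(x := x - 1, x > 0)  =  (x - 1 > 0)  =  (x > 1)

    wp(x := x + y; x := x - 1, x > 0)
        =  wp(x := x + y, x > 1)
        =  (x + y > 1)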

In summary, axiomatic semantics are at too high a level to be of much use to compiler writers (unlike operational semantics). However, they are perfect for proving programs correct.

Denotational Semantics

Mathematical definition of meaning of programming constructs. Find denotation of syntactic elements.

E.g. (4 + 2), (12 - 6), and (2 * 3) all denote the same number.

Developed by Scott and Strachey in the late 60's and early 70's.

Program is defined as a mathematical function from states to states. Use these functions to derive properties of programs (e.g. correctness, soundness of typing system, etc.)

Start with functions defined by simple statements and expressions, combine these to get meaning of more complex statements and programs.

Tiny

Syntactic Domains:

    I in Ide 
    E in NumExp 
    B in BoolExp
    C in Command

Formal grammar

    E ::= 0 | 1 | read | I | E1 + E2 | fn x => E | E1 (E2)
    B ::= true | false | E1 = E2 | not B
    C ::= I := E | output E | if B then C1 else C2 | 
                while B do C |  C1; C2

Semantic Domains:

    State   =   Memory x Input x Output
    Memory  =   Ide -> [Value + {unbound}]
    Input   =   Value*
    Output  =   Value*
    Value   =   Nat + Bool + (Nat -> Value)
where Nat = {0, 1, 2, ...} and Bool = {true, false}.

We assume that the following built-in functions are defined on the above domains:

    and : Bool x Bool -> Bool, 
    if...then...else... : Bool x Y x Y -> Y + {error} 
                                 for any semantic domain Y,     
    =  : Value x Value -> Bool,
    hd : Value* -> Value, 
    tl : Value* -> Value*, 
where each of these has the obvious meaning.

In the denotational semantics given below, we use s or (m, i, o) for a typical element of State, m for Memory, i for Input, o for Output, and v for Value.

Denotational Definitions:

We wish to define meaning functions E, B, and C for numeric expressions, boolean expressions, and commands, respectively.

Note first that the valuation of an expression may

  1. result in an error,
  2. depend on the state, or
  3. cause a side effect.

Therefore we will let E have the following functionality:

        E : NumExp -> [State -> [[Value x State] + {error}]]
where we write
  E [[E]]s = (v,s') where v is E's value in s & s' is the state 
                                after evaluation of E,
      or = error, if an error occurs.

B and C are defined similarly. The three functions are defined below.

Define E : NumExp -> [State -> [[Value x State] + {error}]] by:

    E [[0]]s = (0,s)

    E [[1]]s = (1,s)

    E [[read]](m, i, o) = if (empty i) then error, 
                                        else (hd i, (m, tl i,  o))

    E [[I]](m, i, o) = if m I = unbound  then  error, 
                                        else  (m I, (m, i, o))

    E [[E1 + E2]]s = if (E [[E1]]s = (v1,s1) & E [[E2]]s1 = (v2,s2))  
                                then (v1 + v2, s2)
                                else error

    E [[fn x => E]]s  = fun n in Nat. E [[E]](s[n/x])

    E [[E1 (E2)]]s = E [[E1]]s (E [[E2]]s)
Note difference in meaning of function here from that of operational semantics!

Define B: BoolExp -> [State -> [[Value x State] + {error}]] by:

    B [[true]]s = (true,s)
    
    B [[false]]s = (false,s)
    
    B [[not B]]s = if  B [[B]]s = (v,s') then  (not v, s'), 
                                          else  error
                        
    B [[E1 = E2]]s = if  (E [[E1]]s = (v1,s1)  & E [[E2]]s1 = (v2,s2))  
                               then  (v1 = v2, s2)
                               else error

Define C : Command -> [State -> [State + {error}]] by:

    C [[I := E]]s = if  E [[E]]s = (v,(m,i,o)) then (m[v/I],i,o)
                                                else  error
where m' = m[v/I] is identical to m except the value of I is v.
    C [[output E]]s = if  E [[E]]s = (v,(m,i,o)) then (m,i,v.o)
                                                  else  error
where v.o is the result of attaching v to the front of o.
    C [[if B then C1 else C2]]s = if  B [[B]]s = (v,s')  
                                            then  if  v  then  C [[C1]]s'
                                                         else  C [[C2]]s'
                                            else  error
                                            
    C [[while B do C]]s = if  B [[B]]s = (v,s') 
                                     then if v then if  C [[C]]s' = s''  
                                                    then C [[while B do C]]s''
                                                    else  error
                                               else  s'
                                     else  error
                                   
    C [[C1; C2]]s = if  C [[C1]]s = error then error
                                           else  C [[C2]] ( C [[C1]]s)
End Tiny
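
As one way to read the definitions above, here is a small OCaml sketch of a fragment of Tiny (0, 1, read, identifiers, +, assignment, output, and sequencing), with None playing the role of error and Value cut down to just Nat. The type and function names are mine, not part of the lecture.

    type value = int
    type state = { mem : (string * value) list; input : value list; output : value list }

    type numexp = Zero | One | Read | Ide of string | Plus of numexp * numexp
    type command = Assign of string * numexp | Output of numexp | Seq of command * command

    (* E : NumExp -> State -> [(Value x State) + {error}] *)
    let rec meanE e s =
      match e with
      | Zero -> Some (0, s)
      | One  -> Some (1, s)
      | Read ->
          (match s.input with
           | []        -> None                               (* empty input: error *)
           | v :: rest -> Some (v, { s with input = rest }))
      | Ide x ->
          (match List.assoc_opt x s.mem with
           | None   -> None                                  (* unbound identifier *)
           | Some v -> Some (v, s))
      | Plus (e1, e2) ->
          (match meanE e1 s with
           | Some (v1, s1) ->
               (match meanE e2 s1 with
                | Some (v2, s2) -> Some (v1 + v2, s2)
                | None -> None)
           | None -> None)

    (* C : Command -> State -> [State + {error}] *)
    let rec meanC c s =
      match c with
      | Assign (x, e) ->
          (match meanE e s with
           | Some (v, s') -> Some { s' with mem = (x, v) :: s'.mem }    (* m[v/I] *)
           | None -> None)
      | Output e ->
          (match meanE e s with
           | Some (v, s') -> Some { s' with output = v :: s'.output }   (* v.o *)
           | None -> None)
      | Seq (c1, c2) ->
          (match meanC c1 s with
           | Some s' -> meanC c2 s'
           | None -> None)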

Notice that definition of while is a recursive definition.

Thus, if B [[B]]s = (true, s') and C [[C]]s' = s'', then C [[while B do C]]s = C [[while B do C]]s''.

Solution involves computing what is known as least fixed points.
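
A minimal OCaml sketch of that recursion (errors ignored, names mine): the while clause is the least fixed point of the functional f below, and taking that fixed point amounts to ordinary recursion.

    (* the functional whose least fixed point is C[[while B do C]],
       where b plays the role of B[[B]] and c the role of C[[C]] *)
    let f b c w = fun s -> let (v, s') = b s in if v then w (c s') else s'

    (* unrolling the fixed point by ordinary recursion *)
    let rec while_sem b c s = f b c (while_sem b c) s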

Denotational semantics has gained considerable popularity over the last 15 years.

Many people consider denotational semantics the most appropriate way to study the meaning of a language independent of its implementation.

Good for looking at trade-offs between alternate forms of similar constructs.

Which is best?

No good answer, since they have different uses.

Complementary definitions. Can be compared for consistency.

Programming language definitions are usually still given in English; formal definitions are often too hard to understand or require too much sophistication. They are gaining much more acceptance, however, and formal semantics is now a relatively standard introductory graduate course in CS curricula.

Some success at using formal semantics (either denotational or operational) to automate compiler construction (similar to use of formal grammars in automatic parser generation).

Semantic definitions have also proved to be useful in understanding new programming constructs.