Princeton University
Computer Science Dept.

Computer Science 441
Programming Languages
Fall 1998

Lecture 9


Problems with Types in Pascal

1. Holes in typing system with variant records, procedure parameters, and files.
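    Procedure x(...; procedure y; ...)
       :
       y(a,2);

Parameter types of y were not declared in original Pascal, so the call y(a,2) could not be checked. Fixed in (new) ANSI standard.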

No checking if type of file read in matches what was originally written.

2. Problems w/ type compatibility

Assignment compatibility:

When is x := y legal? x : integer, y : 1..10? reverse?

What if type hex = 0..15; ounces = 0..15;

var x : hex; y : ounces;

Is x := y legal?

Original report said both sides must have identical types.

When are types identical?

Ex.:

    Type    T = Array [1..10] of Integer;
    Var  A, B : Array [1..10] of Integer;
             C : Array [1..10] of Integer;
             D : T;
             E : T;
Which variables have the same type?

Name Equivalence A

Same type iff have same name --> D, E only

Name Equivalence (called declaration equivalence in text)

Same type iff have same name or declared together

--> A, B and D, E only.

Structural Equivalence

Same type iff have same structure --> all same.
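
The three rules can be compared on the declarations of A..E above. A minimal sketch in Python; the dictionary encoding (declaration number, optional type name, structure) is an assumption made for illustration only.

```python
# Hypothetical encoding of the example: which declaration each variable came
# from, the type name used (None for an anonymous array type), and structure.
ARRAY = ("array", (1, 10), "integer")
VARS = {
    "A": {"decl": 1, "type_name": None, "structure": ARRAY},
    "B": {"decl": 1, "type_name": None, "structure": ARRAY},
    "C": {"decl": 2, "type_name": None, "structure": ARRAY},
    "D": {"decl": 3, "type_name": "T", "structure": ARRAY},
    "E": {"decl": 4, "type_name": "T", "structure": ARRAY},
}


def same_name(x, y):
    """Same type iff both use the same type name."""
    return x["type_name"] is not None and x["type_name"] == y["type_name"]


def same_name_or_declaration(x, y):
    """Same type iff same name or declared together."""
    return same_name(x, y) or x["decl"] == y["decl"]


def same_structure(x, y):
    """Same type iff the structures match."""
    return x["structure"] == y["structure"]


def equivalent_pairs(rule):
    """List every pair of variables the given rule treats as the same type."""
    names = sorted(VARS)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if rule(VARS[a], VARS[b])]
```

Calling equivalent_pairs with each rule in turn reproduces the three answers given above.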

Structural not always easy. Let

   T1 = record a : integer; b : real  end; 
   T2 = record c : integer; d : real  end;
   T3 = record b : real; a : integer  end;
Which are the same?

Worse:

   T = record info : integer; next : ^T  end; 
   U = record info : integer; next : ^V  end; 
   V = record info : integer; next : ^U  end; 

Ada uses Name Equivalence A

Pascal & Modula-2 use Name Equivalence for most part. Check!

Modula-3 uses Structural Equivalence

Two types are assignment compatible iff

  1. have equivalent types or

  2. one subrange of other or

  3. both subranges of same base type.
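
The three rules can be sketched as a single check. A minimal illustration in Python, assuming name equivalence for rule 1; the Type class and the letter type are hypothetical, while hex and ounces are the subranges from the earlier example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Type:
    name: str
    base: Optional["Type"] = None   # set only for subrange types


INTEGER = Type("integer")
CHAR = Type("char")
HEX = Type("hex", base=INTEGER)        # hex = 0..15
OUNCES = Type("ounces", base=INTEGER)  # ounces = 0..15
LETTER = Type("letter", base=CHAR)     # hypothetical subrange of char


def assignment_compatible(a, b):
    if a == b:                          # rule 1: equivalent types
        return True
    if a.base == b or b.base == a:      # rule 2: one is a subrange of the other
        return True
    # rule 3: both are subranges of the same base type
    return a.base is not None and a.base == b.base
```

Under these rules x := y is legal for x : hex and y : ounces (rule 3), although a run-time range check may still be needed in general.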

Ada

Ada's Types

Built-In:

Integer, Real, Boolean, Char, strings.

Enumeration types.

Character and boolean are predefined enumeration types.

e.g., type Boolean is (False, True)

Can overload values:

    type Color is (Red, Blue, Green);
    type Mood is (Happy, Blue, Mellow);
If ambiguous can qualify w/ type names:
    Color'(Blue), Mood'(Blue)
Subranges Declared w/range attribute.

e.g., type Hex is range 0..15

Other attributes available to modify type definitions:

	type Accurate is digits 20
	type Money is delta 0.01 range 0.00 .. 1000.00     -- fixed pt!
Can extract type attributes:
	Hex'FIRST -> 0
	Hex'LAST  -> 15
Can initialize variables in declaration:
	declare k : integer := 0

Arrays

"Constrained" - semi-static like Pascal
	type Two_D is array (1..10, 'a'..'z') of Real 
or "Unconstrained" (what we called semi-dynamic earlier)
	type Real_Vec is array (INTEGER range <>) of REAL;
Generalization of open array parameters of MODULA-2.

Of course, to use, must specify bounds,

	declare x : Real_Vec (1..10)
or, inside procedure:
   Procedure sort (Y: in out Real_Vec; N: integer) is -- Y is open array parameter
      Temp1 : Real_Vec(1..N);             -- depends on N
      Temp2 : Real_Vec (Y'FIRST..Y'LAST); -- depends on parameter Y
      begin 
         for I in Y'FIRST ..Y'LAST loop
            ...
         end loop;
         ... 
      end sort;
Note Ada also has local blocks (like ALGOL 60)

All unconstrained types (w/ parameters) elaborated at block entry (semi-dynamic)

String type is predefined open array of chars:

	array (POSITIVE range <>) of character;

Can take slice of 1-dim'l array.

E.g., if

    Line : string(1..80)
Then can write
    Line(10..20) := ('a','b','c','d','e','f','g','h','i','j','k')
                                         -- gives assignment to slice
Because of this structure assignment, can have constant arrays.

Ada Subtypes and derived types:

Types have static properties - checked at compile time

and dynamic properties - checked at run time

Example of dynamic are range, subscript, etc.

Specify dynamic properties by defining subtype. E.g.,

   subtype digit is integer range 0..9;
Subtypes also constrain parameterized array or variant record.
	subtype short_vec is Real_Vec(1..3);
	subtype square_type is geometric (square)
Subtypes do not define new type, add dynamic constraints.

Therefore can mix different subtypes of same type w/ no problems

Derived types define new types:

	type Hex is new integer range 0..15
	type Ounces is new integer range 0..15
Now Hex, Ounces, and Integer are incompatible types: treated as distinct copies of 0..15

Can convert from one to other:

	Hex(I), Integer(H), Hex(Integer(G))
Derived types inherit operators and literals from parent type.
	E.g., Hex gets 0,1,2,... +,-,*,...
Use for private (opaque) types and when don't want mixing.

Compare Ada's solutions w/ Pascal's problems:

Helped by removing dynamic features (subrange bounds, array index bounds) from def of type; these are now subtype constraints.

Can now have open array parameters (also introduced in ISO Pascal).

Variants fixed

Name equivalence in Ada to prevent mixing of different types. E.g., can't add Hex and Ounces.

Can define overloaded multiplication such that if

	l:Length;
	w:Width;
then l * w : Area.
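
The idea can be imitated in any language with operator overloading. A minimal sketch in Python rather than Ada; the class layout and the value field are assumptions made for illustration.

```python
class Area:
    def __init__(self, value):
        self.value = value


class Width:
    def __init__(self, value):
        self.value = value


class Length:
    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        # Only Length * Width is defined, and it yields an Area.
        if isinstance(other, Width):
            return Area(self.value * other.value)
        return NotImplemented   # any other mix is rejected with a TypeError


l = Length(3.0)
w = Width(2.0)
area = l * w    # an Area holding 6.0
```

Because no addition is defined, l + w is an error here, which mirrors the point that distinct types cannot be mixed by accident. Ada would reject the mix at compile time; this sketch only fails at run time.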

Type completeness principle:

No operation should be arbitrarily restricted in the types of the values involved.

Avoid second-class types.

Ex. in Pascal: Restrictions on return values of functions, lack of procedure variables, etc.

ML comes much closer to satisfying.

Summary of types so far:

postpone ADT's until later

Modern tendency to strengthen static typing and avoid implicit holes in type system.

- usually explicit (dangerous) means for bypassing type system, if desired

Try to push as many errors to compile time as possible.

Problem: loss of flexibility which is obtainable from dynamic typing or lack of any typing.

Important direction of current research in computer science:

Provide type safety, but increase flexibility.

Important progress over last 20 years:

Polymorphism, ADT's, Subtyping & other aspects of object-oriented languages.

STORAGE

What are storable values of language? Those that can be stored in a single storage cell and so cannot be selectively updated (only replaced as a whole).

Varies between languages.

Pascal: primitive (integer, real, char, boolean), sets, pointers

ML: primitive, records, tuples, lists, function abstractions, ref's to vbles.

Examine how variables allocated and lifetime.

Program Units:

Separate segments of code - usually allow separate declaration of local variables.

E.g. Procedures, functions, and blocks (from ALGOL 60 & C, like parameterless procedures located in-line.)

Program unit represented during execution by unit instance, composed of code segment and activation record (gives info on parameters and local variables, and where to return after execution).

Activation Record Structure:

Return address

Access info on parameters

Space for local variables

Units often need access to non-local variables.

How is procedure call made?

To call:

1. Make parameters available to callee.

2. Save state of caller (register, prog. counter).

3. Make sure callee knows how to find where to return to.

4. Enter callee at 1st instruction.

To return:

1. Get return address and transfer execution to that point.

2. Caller restores state.

3. If fcn, make sure result value left in accessible location (register, on top of stack, etc.)
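
The call and return steps above can be sketched with an explicit stack of activation records. A rough simulation in Python, not how any particular compiler does it; the record fields follow the structure listed earlier, and the names call, square and saved_state are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ActivationRecord:
    return_address: int          # where to resume in the caller
    params: dict                 # access info on parameters
    saved_state: dict            # caller's registers, saved at the call
    local_vars: dict = field(default_factory=dict)


stack = []   # run-time stack of activation records
trace = []   # log of control transfers, for illustration


def call(callee, params, return_address, caller_state):
    # To call, steps 1-3: parameters, caller state and return address
    # are all made available through the new activation record.
    stack.append(ActivationRecord(return_address, params, dict(caller_state)))
    trace.append(f"enter {callee.__name__}")
    # Step 4: enter callee at its first instruction.
    result = callee(stack[-1])
    # To return: fetch return address, restore state, leave result accessible.
    record = stack.pop()
    trace.append(f"return to {record.return_address}")
    return result, record.saved_state


def square(record):
    record.local_vars["tmp"] = record.params["n"] ** 2
    return record.local_vars["tmp"]


result, restored = call(square, {"n": 4}, return_address=100,
                        caller_state={"r1": 7})
```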

Memory allocation

Three types of languages:

Static: E.g. FORTRAN and COBOL.

Stack-Based: E.g. ALGOL-like languages (including Pascal and C).

Dynamic: LISP, PROLOG, APL, ML, Miranda, Eiffel, etc. as well as aspects of Pascal, Ada, etc.

Static using FORTRAN as example

Units: Main program, Subroutines, and Functions.

All storage (local and global) known at translation time (hence static).

Activation records can be associated with each code segment.

Structure:

Return address

Access info on parameters

Space for local variables

At compile time, both instructions and vbles can be accessed by

(unit name, offset)

At link time can resolve to absolute addresses.
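
Resolving (unit name, offset) pairs amounts to adding a per-unit base address. A minimal sketch in Python; the unit names, sizes and start address are made up for illustration.

```python
# Hypothetical sizes (in words) of each unit's code plus activation record.
UNIT_SIZES = {"MAIN": 120, "SUB1": 40, "FUNC1": 64}


def layout(unit_sizes, start=0):
    """Assign each unit a base address by laying units out one after another."""
    base, address = {}, start
    for unit, size in unit_sizes.items():
        base[unit] = address
        address += size
    return base


def resolve(base, unit, offset):
    """Turn a compile-time (unit name, offset) pair into an absolute address."""
    return base[unit] + offset


base = layout(UNIT_SIZES, start=1000)
address = resolve(base, "SUB1", 5)   # 1120 + 5
```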

Global info shared via common statement:

COMMON/NAME1/A,B,S(25)

Statement must occur in all units wishing to share information. Name of the block must be identical, though can give different names to variables. Identifiers are matched in order w/ no checking of types across unit boundaries, which gives rise to holes in typing.
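
The typing hole can be imitated by letting two units view the same bytes with different types. A small sketch in Python using the struct module; the 8-byte block and the two unit functions are hypothetical stand-ins for FORTRAN units sharing a COMMON block.

```python
import struct

common_block = bytearray(8)   # shared storage: two 4-byte slots


def unit_one_store(a, b):
    # This unit declares the block as two REALs.
    struct.pack_into("<ff", common_block, 0, a, b)


def unit_two_load():
    # This unit declares the same block as two INTEGERs; nothing checks
    # that the declarations agree, so the bits are simply reinterpreted.
    return struct.unpack_from("<ii", common_block, 0)


unit_one_store(1.0, 2.0)
values = unit_two_load()   # (1065353216, 1073741824), not (1, 2)
```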

Space for all common blocks allocated and available globally.

Procedure call and return straightforward

Dynamic Memory Management

(See section 10.8 in text)

We cannot use a stack-based discipline for function calls in a functional language because of difficulties in returning functions as values from other functions.

As a result, activation records must be allocated from a heap. Similar difficulties in passing around closures result in most object-oriented languages relying on heap allocated memory for objects. Because it is often not clear when memory can be safely freed, such languages usually rely on an automatic mechanism for recycling memory.
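
A small illustration of why a stack is not enough, written in Python rather than ML; make_counter is a hypothetical example.

```python
def make_counter(start):
    count = start            # local variable of this activation

    def increment():
        nonlocal count       # refers to make_counter's local variable
        count += 1
        return count

    return increment         # the function value escapes its creator


counter = make_counter(10)   # make_counter has returned by now...
first = counter()            # ...yet its local count is still needed
second = counter()
```

If make_counter's activation record were popped from a stack on return, count would be gone by the time counter is called, so the record has to live on the heap.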

In this lecture we discuss methods for automatically managing and reclaiming free space. We begin with the simpler task of managing free space.

Memory management in the heap

A heap is usually maintained as a list or stack of blocks of memory. Initially all of the free space is maintained as one large block, but requests (whether explicit or implicit) for storage and the subsequent recycling of blocks of memory will eventually result in the heap being broken down into smaller pieces.

When a request is made (e.g., via a "new" statement) for a block of memory, some strategy will be undertaken to allocate a block of memory of the desired size. For instance one might search on the list of free space for the first block which is at least as large as the block desired or one might look for a "best fit" in the sense of finding a block which is as small as possible, yet large enough to satisfy the need.

Whichever technique is chosen, only as much memory as is needed will be allocated, with the remainder of the block returned to the stack of available space.

Unless action is taken, the heap will eventually be composed of smaller and smaller blocks of memory. In order to prevent this, the operating system will normally attempt to merge or coalesce adjacent blocks of free memory. Thus whenever a block of memory is ready to be returned to the heap, the adjacent memory locations are examined to determine whether they are already on the heap of available space. If either (or both) are, then they are merged with the new block and put in the heap.
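
First-fit allocation, splitting, and coalescing as described above can be sketched as follows. A simplified model in Python, assuming the free space is kept as a sorted list of (start, length) pairs and that the caller passes the block size back on release; real allocators usually record the size in a block header instead.

```python
class Heap:
    def __init__(self, size):
        self.free = [(0, size)]   # initially one large free block

    def allocate(self, size):
        """First fit: take the first free block that is large enough."""
        for i, (start, length) in enumerate(self.free):
            if length >= size:
                if length == size:
                    del self.free[i]
                else:
                    # Split: keep the remainder on the free list.
                    self.free[i] = (start + size, length - size)
                return start
        return None   # no single block is large enough

    def release(self, start, size):
        """Return a block and coalesce it with adjacent free blocks."""
        self.free.append((start, size))
        self.free.sort()
        merged = [self.free[0]]
        for s, n in self.free[1:]:
            last_start, last_len = merged[-1]
            if last_start + last_len == s:    # blocks touch: merge them
                merged[-1] = (last_start, last_len + n)
            else:
                merged.append((s, n))
        self.free = merged
```

A best-fit variant would differ only in allocate, choosing the smallest block whose length is at least size instead of the first one found.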

Even with coalescing, the heap can still become fragmented, with lots of small blocks of memory being used alternating with small blocks in the heap of available space.

This can be fixed by occasionally compacting memory by moving all blocks in use to one end of memory and then coalescing all the remaining space into one large block. This can be very complex since pointers in all data structures in use must be updated. (The Macintosh requires the use of handles in order to accomplish this!)
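
Handles make compaction manageable because programs hold an index into a table rather than a raw address, so only the table needs updating when blocks move. A toy model in Python; the class, the handle names and the table layout are hypothetical and not the actual Macintosh memory manager.

```python
class HandleHeap:
    def __init__(self, size):
        self.memory = [None] * size
        self.blocks = {}   # handle -> (start, length): the only place addresses live

    def place(self, handle, start, values):
        """Store values at a given address and record it under the handle."""
        self.memory[start:start + len(values)] = values
        self.blocks[handle] = (start, len(values))

    def read(self, handle):
        start, length = self.blocks[handle]   # extra indirection on each access
        return self.memory[start:start + length]

    def compact(self):
        """Slide all blocks in use to the low end; return start of free space."""
        new_memory = [None] * len(self.memory)
        next_free = 0
        in_address_order = sorted(self.blocks.items(), key=lambda kv: kv[1][0])
        for handle, (start, length) in in_address_order:
            new_memory[next_free:next_free + length] = \
                self.memory[start:start + length]
            self.blocks[handle] = (next_free, length)   # update table only
            next_free += length
        self.memory = new_memory
        return next_free


heap = HandleHeap(10)
heap.place("h1", 2, ["a", "b"])
heap.place("h2", 7, ["c"])
free_start = heap.compact()
```

After compaction the handles h1 and h2 still read back the same contents, even though both blocks have moved.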

