The Definition of Fun

The Definition of Fun

In this document, we use ordinary font for non terminal symbols in the grammar (eg: id).  We use bold for keywords (eg: let, in).

Lexical Issues

  1. Identifiers (id).  An identifier is a sequence of letters, digits, and underscores, starting with a letter.  Uppercase letters are distinguished from lowercase.  
  2. Comments.  Comments may appear between any two tokens.  Comments start with /* and end with */ and may be nested.
  3. White space.  White space (newline, returncarriage, tab, space characters) may appear between any two tokens.
  4. Numbers (num) are 30-bit positive integers.  (numbers may be preceded by a minus sign, but the minus sign should be lexed as a separate token.)
  5. Fun reserves the following keywords.

     

    fun in let while do
    if then else printint ref
    not int      

Program Structure

A program is a sequence of (mutually recursive) function declarations followed by the keyword in followed by an expression.  A Fun compiler will expect there to be a single Fun program per file.

prog ::= fundecl ... fundecl in exp

Function Declaration Structure

A function declaration specifies the name of the function, its argument, argument type, return type and expression body.  Functions are interpreted in call-by-value evaluation order.  All functions must have different names (argument names may be repeated).

fundecl ::= fun id ( id : tp ) : tp = exp

Type Structure

The types are integers (int), n-ary tuples, single-argument/single-result function and references (ref).  The ref constructor has higher precedence than the function type constructor (->).   The function type constructor is right associative and the ref type constructor is left associative.

tp ::= int | <tp, ..., tp> | tp -> tp | tp ref | (tp)

Expression Structure

Expressions are evaluated in left-to-right order.  Expression structure is given by the following BNF. 

un ::= - | not | printint | ! | #i

bin ::= + | - | * | & | || | = | < | :=

exp ::= (exp) | id | num | exp; exp | un exp | exp bin exp | <exp, ..., exp> | exp ( exp )

       | if exp then exp else exp | while exp do exp | let id = exp in exp | ref (exp : tp)

  1. Any expression may be parenthesized: (exp)
  2. Basic values include identifiers, () (the unit value) and numbers.  Sometimes numbers are interpreted as Boolean values (eg: in logical operations; if statements; while loops).  In this case, 0 should be interpreted as "false" and non-zero should be interpreted as "true."
  3. Binary arithmetic and logical operations include &, ||, =, <, +, - , and *.  The & operator is short-circuit conjunction, and || is short-circuit disjunction. Equality (=) operates exclusively over integers.  The assignment operator is :=
  4. Unary operations include - (unary minus), not (logical negation: 0 => 1; non-zero => 0), and printint.  printint takes an integer argument and returns <>, the 0-length tuple.  The unary operator #i extracts the ith field from a tuple.  Tuples are indexed starting with 0.   The i in #i must a non-negative integer and #-0 is not legal.
  5. The operators ref, :=, and ! creates a reference, assigns a value to reference and dereferences a reference respectively.  Notice that when you allocate a reference, you must specify the type of objects held in that ref cell.
  6. If expressions (if exp then exp else exp) execute the "then" branch if the first expression evaluates to a non-zero integer and execute the "else" branch if the first expression evaluates to 0.  The else branch extends as far as possible.  Similarly, the do-expression of the while and in-expression of the let extend as far as possible.
  7. N-ary tuples (N >= 0) are comma-separated sequences separated by angle-brackets:  <exp,...,exp>. 
  8. Precedence levels are as follows:
    1. #i, printint, ref, function call, - (unary)
    2. *
    3. +, - (binary)
    4. =, <
    5. not
    6. &, ||
    7. !
    8. :=
    9. ;
  9. All binary operators are left-associative.
  10. In a call expression, the function argument is surrounded by parens.  Function call is left-associative so f (x) (y) is (f (x)) (y).

Typing

Fun has a strong type system with subtyping.  The following sections define the typing rules for the language.

Subtyping

The judgment (tp1 < tp2) states that tp1 is a subtype of tp2.

 

-------

tp < tp

 

tp1 < tp2     tp2 < tp3

------------------------

tp1 < tp3

 

tp0 < tp0'     ...     tpn < tpn'

---------------------------------

<tp0, ..., tpn>  <  <tp0', ..., tpn'>

 

(0 <= n < k)

-------------------------------------------

<tp0, ..., tpn, ..., tpk>  <  <tp0, ..., tpn>  

 

tp1' < tp1     tp2 < tp2'

---------------------------

tp1 -> tp2 < tp1' -> tp2'

 

tp < tp'     tp' < tp

--------------------

tp ref < tp' ref

Expression Typing

The judgment (G |-- exp : tp) states that expression exp has type tp in context G.  A context G is a finite partial map from identifiers to types.  The context "nil" is the empty context.  The context G [x : tp] is the same as G except it maps identifier x to tp.  The following rules define the expression typing judgement.

------------------

G |-- id : G(id)

 

----------------

G |-- num : int

 

G |-- exp1 : tp1    G |-- exp2 : tp2

---------------------------------------

G |-- exp1 ; exp2 : tp2

 

optype (un) = tp1-> tp2     G |-- exp : tp1

-----------------------------------------------

G |-- un exp : tp2

 

optype (bin) = tp1-> tp2 -> tp3     G |-- exp1 : tp1     G |-- exp2 : tp2

-------------------------------------------------------------------------------

G |-- exp1 bin exp2 : tp3

 

G |-- exp0 : tp0     ...     G |-- expn : tpn

--------------------------------------------

G |-- <exp0, ..., expn> : <tp0, ..., tpn>

 

G |-- exp : <tp0, ..., tpn>       (0 <= i <= n)

-----------------------------------------------

G |-- #i exp : tpi

 

G |-- exp1 : tp1 -> tp2     G |-- exp2 : tp1

-----------------------------------------------

G |-- exp1 exp2 : tp2

 

G |-- exp1 : int     G |-- exp2 : tp     G |-- exp3 : tp

--------------------------------------------------------

G |-- if exp1 then exp2 else exp3 : tp

 

G |-- exp1 : int     G |-- exp2 : <>

--------------------------------------

G |-- while exp1 do exp2 : <>

 

G |-- exp1 : tp1     G [x : tp1] |-- exp2 : tp2

-------------------------------------------------

G |-- let x = exp1 in exp2 : tp2

 

G |-- exp : tp

----------------------

G |-- ref (exp : tp) : tp ref

 

G |-- exp : tp ref

----------------------

G |-- ! exp : tp

 

G |-- exp1 : tp ref     G |-- exp2 : tp

---------------------------------------

G |-- exp1 := exp2 : <>

 

G |-- exp : tp'      tp' < tp

---------------------------  (subsumption rule)

G |-- exp : tp

 

The typeof function is defined by the following table.

 

operator typeof (operator)
- (unary) , not int -> int
printint int -> <>
+ , - (binary) , * , & , || , = , < int -> int -> int

 

Function Declaration Typing

 

G [id2 : tp1] |-- exp : tp2

----------------------------------------------------------------------

G |-- (fun id1 ( id2 : tp1 ) : tp2 = exp)   ==>   id1 : tp1 -> tp2

 

Program Typing

 

contextof(fundecl1 ... fundecln) = G

G |-- fundecl1 ok     ...     G |-- fundecln ok     G |-- exp : int

------------------------------------------------------------------

|-- fundecl1 ... fundecln in exp ok

 

where

 

1.  None of the function declarations (fundecl1...fundecln) are allowed to declare the same function name (argument names may be repeated) and

 

2.  contextof((fun id11 ( id12 : tp11 ) : tp12 = exp1) ... (fun idn1 ( idn2 : tpn1 ) : tpn2 = expn)) =

        nil [id11 : tp11 -> tp12] ... [idn : tpn1 -> tpn2]