Type Checking

Every O'Caml expression has a type. If you have programmed in Java or C, this is a familiar concept. However, types become more important in a functional language like O'Caml because of the focus on "expression-oriented" (i.e., pure) computation, as opposed to imperative computation that operates by assigning values to mutable variables and data structures. The O'Caml type system is also substantially more precise than the type systems of C or Java.

In order for an expression to be well-typed, the types expected by functions and operators must coincide with the types of their arguments. For instance, the expression 2 + 3 is well-typed because the + operator expects two integer arguments and both 2 and 3 have type int. On the other hand, 2.0 + 3.0 is not well-typed. The type of the + operator does not change -- it continues to expect two integer arguments. However, 2.0 and 3.0 both have type float. If we type such an expression into the toplevel, we will see the following sort of error message.

# 2.0 + 3.0;;
Characters 0-3:
  2.0 + 3.0;;
Error: This expression has type float but an expression was expected of type
         int

The error message says that the plus operator expected an argument of type int, but the argument supplied has type float. As another example, suppose we enter our own add function into the toplevel and then supply a string argument:

# let add (x:int) (y:int) : int = x + y;;
val add : int -> int -> int = <fun>
# add (string_of_int 1) 2;;
Characters 4-21:
  add (string_of_int 1) 2;;
Error: This expression has type string but an expression was expected of type
         int
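
Both errors above have straightforward repairs. The sketch below shows one possible fix for each, assuming the intent was floating-point addition in the first case and an integer argument in the second. The names float_sum, int_sum and fixed are made up for illustration.

```ocaml
(* Floating-point addition uses the separate operator +. *)
let float_sum : float = 2.0 +. 3.0

(* Alternatively, convert explicitly before using the integer operator. *)
let int_sum : int = int_of_float 2.0 + int_of_float 3.0

(* The add function from above, repeated so this block is self-contained. *)
let add (x:int) (y:int) : int = x + y

(* Supply an int rather than a string, here by converting the string
   back with int_of_string. *)
let fixed : int = add (int_of_string (string_of_int 1)) 2
```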

You will get a fair number of these error messages as you start out. Do not despair! You will learn how to understand these messages and they will soon become a handy debugging aid. If you are having trouble figuring out the type of a sub-expression in your code, you can type it into the O'Caml toplevel to figure it out:

# string_of_int 1;;
- : string = "1"

Try this with functions too:

# string_of_int;;
- : int -> string = <fun>

The above says that string_of_int is a function from integers to strings. One other technique is to add extra typing annotations to help you figure out the type of an expression. You can add a typing annotation around any expression using the syntax (exp : type). For instance:

# (2 + 3 : int) * 7;;
- : int = 35
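
Annotations can also be nested to pin down the type of each piece of a larger expression. The following is a small sketch with made-up values (the name label is hypothetical). If any annotation disagreed with the actual type of its subexpression, the type checker would report the error at that annotation.

```ocaml
(* Each parenthesized annotation asserts the type of one subexpression. *)
let label : string =
  ("total: " : string) ^ (string_of_int (2 + 3 : int) : string)
```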

Type Checking Rules

Whether or not an expression type checks is governed by a uniform set of rules. Each different kind of expression has one rule for type checking it. For example, there is one rule for assigning a type to each different constant; there is one rule for assigning a type to an expression that uses a binary operator; there is one rule for assigning a type to an expression that applies a function to an argument; etc. One of the first steps in becoming a competent O'Caml programmer involves memorizing these rules.

An important property of all type-checking rules is that they are compositional. This means that to type check an expression, one needs to know the types of its subexpressions, but nothing more. Said another way, the type of a subexpression serves as an effective interface for the subexpression -- when it comes to type checking, it tells you all you need to know.

When explaining the O'Caml type checking rules, we write exp : t to mean "the expression exp is well typed and has type t." In the following paragraphs, we explain a few of the most important type checking rules.

Integer Constants. Any integer constant n has type int. For example: 0 : int and 17 : int and -3 : int. There are similar rules for all other classes of constants such as strings, characters and floating point numbers.
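
As a quick illustration of the constant rules, the sketch below annotates one constant of each class mentioned above with the type its rule assigns. The variable names are arbitrary and chosen only for this example.

```ocaml
(* Integer constants have type int, including negative ones. *)
let n : int = 17
let m : int = -3

(* String, character and floating point constants have their own types. *)
let s : string = "seventeen"
let c : char = 'x'
let f : float = 17.0
```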

Binary Operators. Every binary operator has already been assigned a type of the form t1 -> t2 -> t3 (for some types t1, t2 and t3). To find out the type of an operator, just type it into the O'Caml toplevel surrounded by parens ( ). For example:

# (+);;
- : int -> int -> int = <fun>
# (/.);;
- : float -> float -> float = <fun>
# (^);;
- : string -> string -> string = <fun>
# (@);;
- : 'a list -> 'a list -> 'a list = <fun>

The last operator @ appends two lists together. It happens to be a polymorphic operator -- any type that begins with an apostrophe (like 'a) is a type variable that may be instantiated with another type. We'll learn more about polymorphic types later.

Now that we know that every binary operator op has a type, consider the expression exp1 op exp2. The rule for type checking such expressions is as follows:

if     op : t1 -> t2 -> t3
and  exp1 : t1
and  exp2 : t2
then exp1 op exp2 : t3

As an example, consider the expression (2 + 3) * 5. To analyze this expression, we will start by analyzing the subexpression 2 + 3:

because + : int -> int -> int
and     2 : int
and     3 : int
we conclude that 2 + 3 : int

Now that we know the type of the subexpression 2 + 3, we can analyze the type of the entire expression:

because   * : int -> int -> int
and (2 + 3) : int
and       5 : int
we conclude that (2 + 3) * 5 : int
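
The same analysis applies to the other operators whose types were printed by the toplevel earlier. The sketch below writes the conclusion of each analysis as a type annotation, so O'Caml checks it for us. The names quarter, greeting and numbers are hypothetical.

```ocaml
(* (/.) : float -> float -> float, 1.0 : float and 4.0 : float,
   so 1.0 /. 4.0 : float. *)
let quarter : float = 1.0 /. 4.0

(* (^) : string -> string -> string, so each concatenation has type string. *)
let greeting : string = "type" ^ " " ^ "checking"

(* (@) : 'a list -> 'a list -> 'a list, with 'a instantiated to int here. *)
let numbers : int list = [1; 2] @ [3]
```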

Function Application. Consider an expression f exp that applies the function f to the argument exp. The typing rule for such expressions is as follows.

if f : t1 -> t2
and exp : t1
then f exp : t2

As an example, consider the expression bool_of_string "true". Checking in the O'Caml toplevel, we discover the type of bool_of_string:

# bool_of_string;;
- : string -> bool = <fun>

With this knowledge, we analyze the expression as follows.

because   bool_of_string : string -> bool
and               "true" : string
we conclude that (bool_of_string "true") : bool
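
One way to double-check such an analysis is to write its conclusion as an annotation and let O'Caml confirm it. The sketch below does this for the example above and for one further application. The names flag and text are made up for illustration.

```ocaml
(* bool_of_string : string -> bool and "true" : string,
   so the application has type bool. *)
let flag : bool = bool_of_string "true"

(* The rule composes with the operator rule: 41 + 1 : int and
   string_of_int : int -> string, so the application has type string. *)
let text : string = string_of_int (41 + 1)
```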

The function get from the String module. This function retrieves the nth character in a string, counting from 0. For instance, (get "Dave") 3 returns the character 'e'. How do we understand this expression in terms of the type checking rules we know? The type for get is string -> int -> char and since function types are right associative, this type is equivalent to string -> (int -> char). Here is the analysis:

because get : string -> (int -> char)
and  "Dave" : string
we conclude that (get "Dave") : int -> char
because (get "Dave") : int -> char
and                3 : int
we conclude that ((get "Dave") 3) : char

Note that instead of writing

(get "Dave") 3

we typically write

get "Dave" 3

We may do this because function application is left associative. The expressions with and without parentheses are equivalent.

Generally speaking, whenever we have a function f : t1 -> t2 -> t3 -> ... -> tk, each time we apply an argument, we strip off one of the argument types. Hence f exp1 has type t2 -> t3 -> ... -> tk when exp1 : t1, and f exp1 exp2 has type t3 -> ... -> tk when also exp2 : t2. When exp1 : t1 and exp2 : t2 and ... and expk-1 : tk-1, then (f exp1 exp2 ... expk-1) : tk.
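
To see the argument types being stripped off one at a time, consider the following sketch. The function volume and the intermediate names are hypothetical, introduced only for this illustration. Each annotation records the type that remains after one more argument has been applied.

```ocaml
(* A hypothetical three-argument function: int -> int -> int -> int. *)
let volume (w:int) (h:int) (d:int) : int = w * h * d

(* Applying one argument strips off the first argument type. *)
let with_width : int -> int -> int = volume 2

(* Applying a second argument strips off the next one. *)
let with_width_height : int -> int = with_width 3

(* Supplying the final argument yields the result type. *)
let result : int = with_width_height 4

(* The same idea with String.get : string -> int -> char. *)
let char_of_dave : int -> char = String.get "Dave"
let last_char : char = char_of_dave 3
```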

Function Declaration. The rule for function declaration is straightforward. Given a function definition like this one:

let f (x1:t1) (x2:t2) ... (xk-1:tk-1) : tk = body_exp

The type of the function is t1 -> t2 -> ... -> tk-1 -> tk provided that body_exp has type tk under the assumption that variables have their given types: (x1:t1) and (x2:t2) and ... and (xk-1:tk-1).

As an example, consider test_squares defined as follows.

let test_squares (x:int) (y:int) : bool = 
  x*x + y*y > 10

Here test_squares : int -> int -> bool, since its body has type bool under the assumption that x : int and y : int.
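
The declared type can be confirmed by binding the function to a name annotated with the full function type, and by applying it. This is only a sketch: the names check, small and large are made up for the example.

```ocaml
(* The declaration from above, repeated so the block is self-contained. *)
let test_squares (x:int) (y:int) : bool =
  x*x + y*y > 10

(* This binding is accepted only if test_squares : int -> int -> bool. *)
let check : int -> int -> bool = test_squares

(* Full application yields a bool: 1 + 4 = 5 is not greater than 10,
   while 9 + 16 = 25 is. *)
let small : bool = test_squares 1 2
let large : bool = test_squares 3 4
```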

Let Declaration. A let declaration is really a special case of the more general function declaration -- an ordinary let-bound variable is a "function" with zero arguments. In general, a let declaration like this one:

let x : t = body_exp;;

is well-typed if body_exp has type t -- the type declared for the variable x.
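
For instance, the following sketch shows two well-typed let declarations and, in a comment, one that would be rejected. The names answer, message and wrong are hypothetical.

```ocaml
(* Well-typed: the body 6 * 7 has type int, matching the declared type. *)
let answer : int = 6 * 7

(* Well-typed: the body is a string concatenation, so it has type string. *)
let message : string = "the answer is " ^ string_of_int answer

(* Not well-typed (left commented out so the block compiles): the body
   has type string, but the declared type is int.
   let wrong : int = "forty-two" *)
```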

Let Expressions. A let expression let x = exp1 in exp2 is type-checked similarly to a let declaration. The expression as a whole has type t2 provided that exp1 : t1 and also that exp2 : t2 under the assumption that x : t1.

Let expressions that define functions rather than simple variables are similar.
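
The sketch below illustrates the let expression rule with three small examples, including one that defines a local function. All names are invented for the illustration, and the comments spell out the assumption under which each body is checked.

```ocaml
(* exp1 is 3 + 4 : int, so x : int is assumed while checking the body;
   the body x * x : int, so the whole expression has type int. *)
let squared_sum : int =
  let x = 3 + 4 in
  x * x

(* A let expression defining a local function: double : int -> int is
   assumed while checking the body double 21 : int. *)
let doubled : int =
  let double (n:int) : int = n + n in
  double 21

(* The bound variable and the body may have different types:
   here len : int but the body has type bool. *)
let is_short : bool =
  let len = String.length "Dave" in
  len < 10
```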

Exercise: Use the toplevel environment to discover which of the following expressions are well-typed and which are not. In this exercise, we have left off type annotations on declarations in several places -- O'Caml will infer them for you. However, it is good style to put type annotations on your top-level declarations.

For those expressions that are not well-typed, figure out how to fix them. Observe and remember the error messages that the toplevel supplies you with.

let x = 17.0 in x + x;;

let square (x:int) : int = x**2;;

let add_squares x y = x*x + y*y;;

add_squares (add_squares 2) (add_squares 3);;

let y = string_of_int 7 in y;;

let s = "my lucky number is: " ^ y;;

let s' = s ^ '!';;