O'Caml Programming Basics

These notes introduce a few of the most basic concepts you need to know in order to begin programming in a functional language like O'Caml. We focus on central concepts rather than on giving a complete description of the language. You will want to use these notes in conjunction with other resources such as the O'Caml manual, the O'Caml standard library, and our programming style guide. When you finish with these notes, you should also glance at the Pervasives library -- it is always open and contains many of the most commonly used operations and functions. If this tutorial doesn't do it for you, consider looking at Scott Smith's introduction to O'Caml, which has a few more details.

Types, Expressions, Values, Declarations

Basic O'Caml programs are made up of five different kinds of things: types, values, expressions, declarations and comments. Comments are easy: they begin with (* and end with *). For example:
(* I am a comment.

     (* Did you know comments can be nested? *)
*)


Some examples of types, values, expressions and declarations are given below.

  • Types: types describe a set of values and operations on those values. Examples:
    • int
    • float
    • char
    • bool
    • string
    • int -> bool (the type of functions with integer arguments and boolean results)
    • int -> bool -> string (the type of functions with two arguments: first an integer and then a boolean; the function result has type string)
  • Values: values are the data that results from executing a computation. Examples:
    • integer values: -44, 0, 1, ...
    • float values: -3.0, 0., 0.0, 2.5e3, 2.5E3, 2500., ...
      • note that 0, 1, -2 are integer values but not float values.
      • float constants must contain a "." or an exponent (as in 1e3) to distinguish them from int constants
    • character values: 'a', 'b', '\n'
    • boolean values: true, false
    • string value: "hello, my name is Dave\n"
    • function value: fun x -> x-3
      • A function value with argument x returning a result three less than x.
      • And yes, I'm glad you asked: A function is a value! It is just as much a value as 2 or "hello". It can be a result of a computation.
  • Expressions: expressions are the basis of all computation in a functional language. Each expression has a type and, if it terminates, produces a value. Examples:
    • integer expression: (2 + (-3)*3*5 - 1) / 32
    • float expression: (2.5 +. (-3.) *. 7.2 -. 1.) /. 32.
      • operations such as + and - are integer operations
      • operations such as +. and -. are float operations
      • float operations end with "." to distinguish them from int operations
    • character expression: char_of_int 88
      • The function char_of_int computes the character with the given ASCII code. (In this case, the character with code 88 is 'X'.)
      • There are a number of other conversions between types in the Pervasives library. They are typically called type2_of_type1 where type1 is the type of the argument and type2 is the type of the result.
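    • boolean expression: not ('a' <= 'b') || (true && 2 <> 1)
      • The operator <> tests structural inequality. (The operator != tests physical inequality, which is rarely what you want.)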
    • string expression: "hello," ^ " my name" ^ " is Dave\n"
      • The operator ^ concatenates two strings
      • It is right-associative, unlike the integer subtraction operation, which is left-associative.
    • function expression: compose (fun x -> x+2) (fun x -> x-1)
      • compose is a function that takes two other functions as arguments. (It is an easy function to write in O'Caml.)
  • Declarations: declarations introduce new variables (i.e., names) to stand for types and values. Here are some example declarations:
    (* school is a new type name; it is equal to the type string *)
    type school = string;;
    (* age is a new type name; it is equal to the type int *)
    type age = int;;
    (* compare is a new type name; 
     * it is equal to    age -> age -> bool 
     * and also equal to int -> int -> bool *)
    type compare = age -> age -> bool;;
    (* dylan_age and my_age are new value names;
     * dylan_age is equal to 3; my_age is equal to 39
     * both dylan_age and my_age have type age (which is the same as type int)
     * by the way, "value names" are usually called "variables"; 
     * we will call them that from now on *)
    let dylan_age : age = 3;;
    let my_age : age = dylan_age * 10 + 9;;
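
Several of the examples in the list above can be checked with a short sketch like the one below. The definition of compose is my own assumption (the notes do not give one), and the names float_result, letter_x, three, add_one and older are hypothetical names introduced only for illustration.

```ocaml
(* A float expression: every operator and constant is a float one. *)
let float_result : float = (2.5 +. (-3.) *. 7.2 -. 1.) /. 32.

(* Conversions follow the type2_of_type1 naming convention. *)
let letter_x : char = char_of_int 88
let three : float = float_of_int 3

(* One possible definition of compose (an assumption, not taken from
   the notes): apply g first, then apply f to the result. *)
let compose (f : int -> int) (g : int -> int) : int -> int =
  fun x -> f (g x)

(* A function built from two other functions. *)
let add_one : int -> int = compose (fun x -> x + 2) (fun x -> x - 1)

(* Type abbreviations are interchangeable with the types they name. *)
type age = int
type compare = age -> age -> bool

(* older is a hypothetical function whose type is compare. *)
let older : compare = fun a b -> a > b

let dylan_age : age = 3
let my_age : age = dylan_age * 10 + 9
```

Since age is just another name for int, the ordinary integer operators such as * and > work on values of type age without any conversion.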

O'Caml Toplevel

To begin to play with O'Caml, you can start the O'Caml toplevel interpreter (or just the "toplevel" for short). To do so, simply type ocaml in your shell. If $ is your unix prompt, you should see the following:

$ ocaml
        Objective Caml version 3.12.0


Now, you are in the O'Caml toplevel environment and you can use it like a sophisticated calculator. Type in any O'Caml expression you want, terminated with a double semi-colon. When you press return, the toplevel will determine the type of the expression and evaluate the expression, producing a value. For example, if you type:

# 2 + 3;;
then O'Caml responds with:
- : int = 5
Here is a complete session where the user typed in several expressions as well as some declarations. The session was terminated when the user typed #quit;;. Directives to the O'Caml toplevel always begin with a # symbol.
$ ocaml
        Objective Caml version 3.12.0

# 2+3;;
- : int = 5
# "hello" ^ " " ^ "world";;
- : string = "hello world"
# 2.0 +. 3.8;;
- : float = 5.8
# max_int;;
- : int = 1073741823
# not true;;
- : bool = false
# let x = 2 + 3;;
val x : int = 5
# x + x + 3;;
- : int = 13
# type age = int;;
type age = int
# let my_age : age = 3;;
val my_age : age = 3
# #quit;;


O'Caml Compiler

That was fun, but you need a way to save your work. As in any ordinary programming language, you can enter declarations and expressions into a file and compile the file. For instance, you can create a file and include the following expression inside it:

print_endline "hello world";;
print_endline is a function in the Pervasives (i.e., always open) library that prints a string followed by an end-of-line character. (By the way, you should take a glance through the Pervasives library. You will use it a lot. It contains all of the most common functions on base types like integers, booleans, floats, strings and a few other things.) OK, now let's compile this file and create an executable named "hello" by typing the following.
ocamlc -o hello
If O'Caml is properly installed, you should be able to run your executable by typing this:
./hello
After typing the above and pressing enter, you should see "hello world" printed out:
hello world

Let's try creating a slightly more sophisticated file -- one containing a function. In O'Caml, you can create a function simply by using a let declaration. The file is shown below.
(* n^3 *) 
let cube (n:int) : int =
  n * n * n

(* print n and its cube *)
let message (n:int) : string = 
  "The cube of " ^ string_of_int n ^ " is " ^ string_of_int (cube n)

let arg1 : int = 2;;
let arg2 : int = 3;;

(* the main expression *)
print_endline (message arg1);;

The code above contains two functions, cube and message. The basic syntax for writing a function is as follows.

let function-name (argument-name : argument-type) : result-type =
  function-body
If desired, you can create multi-argument functions simply by adding more argument names with their types. For instance, here is a function called crazy with integer arguments x, y and z.
let crazy (x : int) (y : int) (z : int) : int =
  (x + y) / z
The types of arguments and results are optional, but it is good style to include them (particularly on top-level functions). To call a function, you simply write down the function name and its argument. For instance, we called the string_of_int function on an argument n as follows:
string_of_int n
Some languages require parentheses around arguments, but O'Caml does not -- simply place a space between the function name and the argument. However, sometimes we call one function and then use its result as an argument to the next function. We did that when we called string_of_int on the result of calling cube on n:
string_of_int (cube n)
If we had left off the parentheses like this:
string_of_int cube n
O'Caml would have assumed that string_of_int was a 2-argument function that accepted cube as its first argument and n as its second argument. If we then compiled the file, we would see the following error message:
$ ocamlc -o cube
File "", line 8, characters 48-61:
Error: This function is applied to too many arguments;
maybe you forgot a `;'

The main point here is that while parentheses around a function argument are not a necessary part of the function call syntax (like they are in C or Java), they do serve a purpose for defining the extent of an expression. You have all done this before when writing mathematical expressions such as 3*(2+1). The parentheses serve to indicate that the operation 2+1 should be computed first and its result should be multiplied by 3.
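
To make the role of parentheses concrete, here is a small sketch that reuses cube and crazy from above. The names a, b, c and s are hypothetical and only there to hold the results.

```ocaml
let cube (n : int) : int = n * n * n

let crazy (x : int) (y : int) (z : int) : int = (x + y) / z

(* Arguments are separated by spaces, with no commas or parentheses. *)
let a : int = crazy 1 2 3              (* (1 + 2) / 3 *)

(* Function application binds more tightly than +, so this reads as
   (cube 2) + (cube 3). *)
let b : int = cube 2 + cube 3

(* Parentheses make the sum the argument instead. *)
let c : int = cube (2 + 3)

(* Parentheses are needed when an argument is itself a function call. *)
let s : string = string_of_int (cube 2)
```

Here b and c differ only in where the parentheses sit, which is exactly the distinction between 3*2+1 and 3*(2+1).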

Debugging with the Toplevel

The toplevel environment can be very useful when debugging code that you have written down in files. To load the declarations defined in a file (and run the expressions in that file), simply type

#use "filename";;
at the prompt. When you do that, all of the declarations defined in the file become available to you. If you are using emacs, and have Tuareg mode set up, you can also just type C-c C-b when inside a file with a .ml extension. This will open the O'Caml interpreter and read in the file automatically.

As an example, here is an O'Caml session that uses and explores the file. It assumes that the file is in the directory from which you launched the ocaml interpreter. (If the file is not in the current directory, you can move from one directory to another with the #cd directive; the O'Caml manual lists the other toplevel directives.)

$ ocaml
        Objective Caml version 3.12.0

# #use "";;
val cube : int -> int = <fun>
val message : int -> string = <fun>
val arg1 : int = 2
val arg2 : int = 3
The cube of 2 is 8
- : unit = ()
# cube 0;;
- : int = 0
# cube 2;;
- : int = 8
# cube 3;;
- : int = 27
# cube arg2;;
- : int = 27
# cube 2 + cube 3;;
- : int = 35
# message 2;;
- : string = "The cube of 2 is 8"
# #quit;;
You will notice that at the top, after we used the file, the system printed out a list of variables, the values associated with them (where appropriate -- the code for a function value is not printed), and the types of the variables. In this file, the function values include cube and message. The integer values include arg1 and arg2. This line:
The cube of 2 is 8
shows the result of executing the main expression. (There does not have to be just 1 main expression; there can be any number of expressions embedded in a file. The O'Caml toplevel will evaluate all of them.) After using the file, we are free to play with the definitions by typing in further expressions (or declarations) that use them. This is a good way to start debugging declarations and expressions you have just written.

Here is one more simple example function.

let silly (x:int) (y:int) : int =
  (x+y) * (x+y) - (x+y)
That code is silly for a number of reasons. One of the reasons is that good programmers don't write an expression like x+y multiple times -- they write down the expression once and associate it with a variable. In O'Caml, a let expression serves this purpose. Here is an example.
let less_silly (x:int) (y:int) : int =
  let z = x+y in
  z * z - z
The way to think about a let expression is that the variable introduced (z in this case) is exactly equal to the value computed (x+y in this case) and that variable may be used in the expression following the in keyword. In general, a let expression has the following form:
let variable-name = expression1 in expression2
The variable-name may be used in expression2, but unlike a let declaration (which does not use the keyword "in"), the variable-name may not be used more widely. It is hidden from outsiders. This is a great thing! Information hiding is the basis for modular computation. If you were to type the code for less_silly into the toplevel, O'Caml would tell you that you had created one declaration:
val less_silly : int -> int -> int = <fun>
It would not mention the inner variable z.
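
As a quick check of the scoping rule, the sketch below defines both versions and then declares a separate top-level z. The top-level z, same and result are hypothetical names added for illustration; the point is that the z inside less_silly is local and has nothing to do with any z outside it.

```ocaml
let silly (x : int) (y : int) : int =
  (x + y) * (x + y) - (x + y)

let less_silly (x : int) (y : int) : int =
  let z = x + y in      (* z is visible only in the line below *)
  z * z - z

(* A top-level z is a different variable from the z inside less_silly. *)
let z : int = 100

(* Both versions compute the same result. *)
let same : bool = (silly 2 3 = less_silly 2 3)
let result : int = less_silly 2 3
```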

Now, there is no reason to treat integer variables like z any differently than variables with function type. Consequently, local let expressions defining functions are also allowed. Consider the function double_square, which contains within it another function square.

let double_square (x:int) (y:int) : int =
  let square (n:int) : int = n * n in
  square x + square y
Just like z was a local integer variable, square is a local function variable. There is little difference.

In terms of style, all top-level functions like double_square should be annotated with their types. However, it is acceptable to leave the types off of local functions like square. Hence I might rewrite the above as follows.

let double_square (x:int) (y:int) : int =
  let square n = n * n in
  square x + square y
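
Local definitions can also be chained, with later ones using earlier ones. The following is a minimal sketch; sum_of_powers is a hypothetical function invented for this example and is not part of the notes.

```ocaml
let double_square (x : int) (y : int) : int =
  let square n = n * n in
  square x + square y

(* sum_of_powers is a hypothetical name used only for illustration.
   The local function cube is defined in terms of the local square;
   O'Caml infers that both have type int -> int. *)
let sum_of_powers (x : int) (y : int) : int =
  let square n = n * n in
  let cube n = n * square n in
  square x + cube y
```

Neither square nor cube is visible outside the function that defines it, so each top-level function is free to define its own helpers with the same names.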

Using Other Modules

We will discuss how to define our own modules later. For now, it suffices to know how to use values defined in other modules in the standard libraries. If the module is named Mod and it defines a value x then one uses dot notation to refer to it: Mod.x. For example, the Char library contains a number of functions for operating over characters. Here, we use them within the toplevel.

$ ocaml
        Objective Caml version 3.12.0
# let lower_x = Char.lowercase 'X';;
val lower_x : char = 'x'
# let upper_x = Char.uppercase lower_x;;
val upper_x : char = 'X'
# #quit;;
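
The same dot notation works in a file. The sketch below uses Char.code, Char.chr and String.length from the standard library; the variable names are hypothetical, and the offset of 32 between upper-case and lower-case letters is a property of ASCII rather than anything specific to O'Caml.

```ocaml
(* Char.code gives the ASCII code of a character. *)
let code_x : int = Char.code 'X'

(* Char.chr goes the other way; in ASCII, a lower-case letter sits
   32 positions after its upper-case counterpart. *)
let lower_x : char = Char.chr (code_x + 32)

(* String.length comes from a different module, String. *)
let len : int = String.length "hello world"
```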


We have covered a good chunk of the core syntax of O'Caml so far. Here are the expression forms we have seen:

  • values
  • variables
  • variable defined by another module: ModuleName.var
  • expression with a type annotation: (exp : type)
  • binary operator op: exp1 op exp2
  • function f applied to arguments: f exp1 exp2 ... expk
  • let expression declaring a local variable: let var = exp1 in exp2
  • let expression with a type annotation: let var : type = exp1 in exp2
  • let expression declaring a local function: let f var1 var2 ... vark = exp1 in exp2
  • local function with type annotations:
    • let f (var1:type1) ... (vark:typek) : type_result = exp1 in exp2

Here are the declarations we have seen:

  • type declaration: type type_name = type;;
  • let declaration: let var : type = exp1;;
  • function declaration:
    • let f (var1:type1) ... (vark:typek) : type_result = exp1;;
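
To tie the two summaries together, here is one small sketch that uses each declaration form and several of the expression forms. The names celsius, freezing and above_freezing are hypothetical and were chosen only for this example.

```ocaml
(* type declaration *)
type celsius = int

(* let declaration *)
let freezing : celsius = 0

(* function declaration *)
let above_freezing (t : celsius) : bool =
  (* let expression with a type annotation *)
  let diff : int = t - freezing in
  (* expression with a type annotation *)
  (diff > 0 : bool)
```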