Caml
Power

## Problem Set 7: Parallel Sequences

This assignment is longer and more open-ended than previous assignments. We have left you a README file in which you may comment on any unusual design decisions you made or problems that you had. You are not required to write lengthy explanations of your code here but you may add any elements that you believe will help the graders understand and appreciate your work. Also, any comments on the assignment and suggestions for next year are appreciated.

### Part 1: Parallel Sequences

A sequence is an ordered collection of elements. OCaml's built-in `list` type is one representation of sequences. In this problem, you will develop another representation for sequences that support a number of parallel operations. For example, the `map` function will apply a function to all the elements of the sequence in parallel.

The following table summarizes the most important operations in the sequence module. The work and span columns describe the complexity requirements of your implementation under several assumptions listed below. The only special case is flatten -- you may use any correct, functional implementation without worrying about the work/span complexity.

Function Work Span Notes
length O(1) O(1)
empty O(1) O(1)
cons O(n) O(1) Second argument has length n.
singleton O(1) O(1)
append O(n+m) O(1) Arguments have length n and m.
tabulate O(n) O(1) Argument has length n.
nth O(1) O(1)
filter O(n) O(log n) Argument has length n.
map O(n) O(1) Argument has length n.
reduce O(n) O(log n) Argument has length n.
repeat O(n) O(1) Second argument is n.
flatten -- -- Sum of lengths of lists contained in argument is n.
zip O(min(n,m)) O(1) Arguments have length n and m.
split O(n) O(1) Argument has length n.

### Complexity Assumptions

The OCaml Thread library provides functions for creating and interacting with independent tasks that may be executed in parallel. In order to implement the operations in the parallel sequences library, you will assume that OCaml threads operate in parallel. You will also have to make the following assumptions:

• Array.make is O(1) work and span.
• The two functions below, multi_create and multi_join, are O(1) work and span.
These costs are not accurate, of course, in the actual OCaml run-time, but they do hold in the idealized parallel machine we assume you would use.

```let multi_create (f:int -> unit) (n:int) : Thread.t array =
let a = Array.make n (Thread.self ()) in
for i=0 to n-1 do
done;
a

let multi_join (a:Thread.t array) : unit =
```

For experts: Array.make is actually performing two tasks: allocation and initialization. Allocation can indeed be peformed in (amortized) constant time. Initialization is trickier -- it would be implemented similarly to multi-create on an idealized parallel machine. For multi_create and multi_join, one can imagine a parallel machine where initializing a thread of execution running the same function on each of its processing units can be done in constant time. Our multi_create lets you do exactly this.

Your task is to build an implementation of sequences that have the work and span listed above. We have provided a complete interface in the `sequence.mli` as well as a reference implementation based on lists in ` sequence.ml`. This implementation is only meant to be a guide — it does not have the time complexities described above, but may be useful for debugging.

To submit: sequence.ml.

### Part 2: Map-Reduce Applications

The following applications are designed to test out your sequence implementation and illustrate the "Map-Reduce" style of computation.

When implementing your routines, you should do so under the assumption that your code will be executed on a parallel machine with many cores. You should also assume that you are computing over a large amount of data (eg: computing your inverted index may involve processing many documents; your perfect matching service may have many profiles). Hence, an implementation that iterates in sequence over all profiles or over all documents will be considered impossible slow --- you must use bulk parallel operations such as map and reduce when processing sets of documents to minimize the span of your computation. Note, however, that we do not care about small constants; you do not need to over-optimize your code. Style and clarity is important too.

### Inverted Index

An inverted index is a mapping from words to the documents in which they appear. For example, if we started with the following documents:

Document 1:

`OCaml map reduce`

Document 2:
`fold filter ocaml`
The inverted index would look like this:

worddocument
ocaml1 2
map 1
reduce 1
fold 2
filter 2

To implement this application, you should take a dataset of documents (such as `data/reuters.txt` or `data/test1.txt`) as input and use the map and reduce operations to produce an inverted index. Complete the `mkindex` function in `apps/inverted_index.ml`. This function should accept the name of a data file and print the index to screen. To get started, investigate some of the functions in the module `Util` (with interface `util.mli` and implementation `util.ml`). In particular, notice that `Util` contains a useful function called `load_documents` --- use it to load a collection of documents from a file. You should print the final result using `Util` function `print_reduce_results`.

Hint: The example above suggests that your inverted index should be case insensitive (OCaml and ocaml are considered the same word). You might find functions such as String.uppercase and/or String.lowercase from the String library useful.

### A Perfect Matching

For the next application, you will implement `apm`, a simple dating service. The files `data/profiles.txt` (long) and `data/test_profiles.txt` (shorter) contain some randomly generated profiles of fictional people seeking partners for a romantic relationship in the files. The type of a profile is `Apm.profile`. We have supplied code in `apm.ml` for parsing data files and printing results to the console. You must write the `matchme` function.

This function takes several arguments as a string array: the data file, the number of matches requested, and the first and last name of the client, who should have a profile in the data file. The map step should pair the client's profile with every other profile in the database and test the match between them. For each pair, compute a compatibility index, which is a `float` between 0. (perfectly incompatible) and 1. (perfectly compatible). We leave the details of this computation up to you. Please include comments as to your reasoning; be respectful and don't get too carried away.

The reduce step should use the compatibility index and the corresponding pairs to create the desired number of matches. The results should then be printed out the using the supplied `print_matches` function. The `Util` module has some functions for reading in files and manipulating strings that may be useful.

Here is a sample run of apm using our solution.

```------------------------------
Client: Jeremiah Sanford
sex: M  age: 23  profession: trucker
nondrinker  nonsmoker
has children  does not want children
prefers a female partner between the ages of 19 and 29
likes classical music and sports

2 best matches:
------------------------------
Compatibility index: 0.691358
Katy Allen
sex: F  age: 21  profession: trucker
nondrinker  nonsmoker
no children  does not want children
prefers a male partner between the ages of 18 and 24
likes country music and sports
------------------------------
Compatibility index: 0.691358
Corina Savage
sex: F  age: 24  profession: trucker
nondrinker  nonsmoker
no children  does not want children
prefers a male partner between the ages of 18 and 32
likes country music and sports
```

Complete the `matchme` and `mkindex` functions making use of the parallel map and reduce functionality provided by your Sequence implementation. You may use your own metric for calculating match compatibility.

You have also been provided with a Makefile in the apps directory that will compile the apps into an executable for testing. For example to run the inverted index, run make (on Unix) and then run:

```  ./main_apps mkindex data/test1.txt
```
and for A Perfect Match run
```  ./main_apps matchme data/profiles.txt 5 John Doe
```

DO NOT change any code already provided to you (especially the parsing functions in apm.ml). Make sure you place any additional functions you write in your submission files.

To submit: apm.ml and inverted_index.ml.

### Part 3: N-Body Simulation

An n-body simulation models the movement of objects in space due to the gravitational forces acting between them. There is a straightforward O(n^2) algorithm that computes the movement of each object by calculating the interactions with every other object. Your task in this problem will be to implement a more efficient algorithm that approximates an n-body simulation. Your solution will build on the code for parallel sequences developed in Part 2.

### Naive Algorithm

In an n-body simulation, we are given a collection of n objects or bodies, each possessing a mass, location, and velocity. The goal of the simulation is to compute, for each time step, the new positions and velocities for each body based on the gravitational forces acting on each.

Assume that we have a module `Plane` that defines representations for scalar values, two-dimensional points, vectors, and common functions such as Euclidean distance (implementing such a module will be a part of this problem). Using `Plane`, we can define a type that represent the mass, position, and velocity of a body,

```  type mass = Plane.scalar
type location = Plane.point
type velocity = Plane.vector
type body = mass * location * velocity
```
and a function `acc` that calculates the acceleration of one body on another:
```  val acceleration : body -> body -> Plane.vector
```

To understand how the `acceleration` function works, we need to review a few basic facts from physics. Recall that force is equal to mass times acceleration (F = m × a) and the gravitational force between objects with masses m and n separated by distance d is given by (G × m × n) / d² where G is the gravitational constant. Putting these two equations together, and solving for a, we have that the magnitude of the acceleration vector due to gravity for the object with mass n is G × m / d²; the direction of the acceleration vector is the same as the direction of the unit vector between the objects. Note that this calculation assumes that the objects do not collide. We will make this simplifying assumption throughout this problem.

Using `acceleration` and the functions from the `Sequence` module, we can then define a function `accelerations` that computes the accelerations on each of the bodies in the simulation:

```val accelerations : body Sequence.t -> Plane.vector Sequence.t
```
Finally, we update the position p and velocity v of each body to p + v + a/2 and v + a respectively, where a is the `Plane.vector` in the sequence returned by `accelerations`.

#### Streams

An n-body simulation produces a stream of results — one "frame" for each step of the simulation. In your solution, you will need to define a function `make_simulation` that creates a stream to represent a simulation. We will use the following datatype to represent streams:
```  type 'a stream = Nil | Cons of 'a * (unit -> 'a stream)
```
The helper functions we have provided for running simulations (discussed next) will access these streams and give you a way to test your code on simple examples.

#### Testing and Visualization

We have provided several modules, helper functions, test cases, and a graphical interface that may be useful while developing your solution. The function `run_simulation` in `Main_nbody` can be used to generate "transcripts" of simulations. The module `Test_nbody` defines the initial values for several simulations. The Java application `bouncy.jar` (depicted in the image above) can be used to animate a simulation using a transcript. Assuming you have java installed on your system, simply run

```java -jar bouncy.jar
```
and the GUI will pop up. If you click on the "open" button, you will be able to load a transcript file. Then click the "run" button to execute that transcript.

Your task for the first part of this problem is to implement the naive n-body algorithm. In particular, you will need to implement `plane.ml` and fill in the missing functions in `naive_nbody.ml`.

To submit: plane.ml and naive_nbody.ml.

### The Barnes-Hut Algorithm

The Barnes-Hut algorithm provides a way to approximate an n-body simulation, while dramatically decreasing its cost. The insight behind the algorithm is that the gravitational forces between objects separated by large distances are weak, so they can be approximated without affecting the fidelity of the overall simulation very much.

The algorithm works by grouping bodies into regions and using a fixed threshold θ to determine if it should perform the exact `acceleration` calculation for the each body in the region, or if it should merely approximate `acceleration` using a pseudobody that represents aggregate information about all of the bodies contained in the region. This dramatically reduces the time needed to perform large simulations — asymptotically, from O(n^2) to O(n log n).

#### Bounding Boxes and Quad Trees

To represent regions, we will use bounding boxes comprising two points:

```type bbox = {north_west:Plane.point; south_east:Plane.point}
```
To represent the decomposition of a collection of bodies into regions, we will use the following datatype:
```type bhtree =
Empty
| Single of body
| Cell of body * bbox * (bhtree Sequence.t)
```
Intuitively, `Empty` represents a region with no bodies; `Single(b)` represents a region with a single body `b`; and `Cell(b,bb,qs)` represents a region with pseudobody `b`, bounding box `bb`, and four sub-quadrants `qs`. The position of the pseudobody in a `Cell` is the center of mass (or barycenter) of all objects in the region. The mass of the pseudobody is the total mass of all objects. The velocity of the pseudobody can be arbitrary, as it is not needed for the `acceleration` calculation. In your solution, you will likely have to write a function that decomposes a sequence of n bodies into a `bhtree`.

#### Simulation

The actual Barnes-Hut simulation works as follows. For each body b, the algorithm traverses the top-level `bhtree`. If it encounters a region with more than one body — i.e., a `Cell` node — it checks whether θ ≥ m / d, where m is the total mass of the pseudobody associated with the region, and d is the distance between b and the pseudobody's center of mass. If so, then it treats the entire region as if it were one large pseudobody. Otherwise, it recursively performs the computation on the sub-quadrants of the region. The base cases are the `Empty` and `Single(b)` cases, where the algorithm simply invokes the `acceleration` function as in the naive algorithm.

Your task for the second part of this problem is to implement the Barnes-Hut algorithm. You will need to fill in the missing code in `barnes_hut_nbody.ml`.

To submit: barnes_hut_nbody.ml.

### Handin Instructions

You may do this problem set in pairs. If you do so, both students are responsible for all components of the assignment. Please write both names at the top of all files if you do it in pairs. Please only have 1 member of the pair hand in the assignment to dropbox.

You must hand in these files to dropbox (see link on assignment page):

1. `README`
2. `sequence.ml`
3. `inverted_index.ml`
4. `apm.ml`
5. `plane.ml`
6. `naive_nbody.ml`
7. `barnes_hut_nbody.ml`

### Acknowledgments

This assignment is based on materials developed by Dan Licata and David Bindel originally, and modified by Nate Foster.