COS 226 Programming Assignment 2

Sorting and merging networks

Implement a recursive shuffle-exchange mergesorting network, and using three different implementations of the compare-exchange operation, get it to The purpose of this assignment is to introduce you to a doubly-recursive divide-and-conquer method that has important applications in parallel sorting hardware and large-scale parallel sorting while at the same time giving you experience using abstract operations on data and separately compiled programs.

Throughout this assignment, assume that N, the number of elements to be sorted, is a power of 2.

We start with a standard recursive mergesort, except that we use, for merging, the recursive Batcher's odd-even merge that is described in Section 11.1. These merge and and sort methods have the property that the sequence of compare-exchange operations is completely independent of the data. The double recursion is an easy way to implement the sort, but it also leads to a sequence of compare-exchanges that would seem quite mysterious if we did not know the recursive structure.

The file batcher.c on the course web page gives an implementation of Batcher's method as an index sort, and Sort.c uses it to sort N randomly generated integers. It puts the integers to be sorted in an array data, initializes an index array with a[i]=i, then shuffle and unshuffle the index array during the recursive sort, accessing an array data only indirectly when a compare-exchange routine that takes two integer indices as argument is called. The file batcher.h is an interface that declares the two routines shared by these two programs, compexch, and mergesort. A command like lcc Sort.c batcher.c will compile the two functions; link the compare-exchange calls in the sort to the definition in the main program and the sort call in the main program to the definition in batcher.c; and make an a.out file.

To familiarize yourself with the code, copy these three files, then modify Sort.c to generate and sort random floats between 0 and 1 rather than integers. Also, instrument it to count the compare-exchanges. Run your program for 16, 32, 64, 128, 256, 512, 1024, 2048, 4096 elements and print out the number of compare-exchanges used for each run. For the next two parts of this assignment, you will accomplish different effects by providing quite different implementations of the compare-exchange.

The shuffling is for our convenience in organizing the computation; it is also interesting to know what compare-exchanges the method does if there is no shuffling. Your second task is to write a different driver program that uses a different compare-exchange implementation to print out the compare-exchanges actually done on the data, using the original data indices. You should not change batcher.c or batcher.h at all. The sequences for 2, 4, and 8 elements are:

2: 0-1
4: 0-1 2-3 0-2 1-3 1-2
8: 0-1 2-3 0-2 1-3 1-2 4-5 6-7 4-6 5-7 5-6 0-4 2-6 2-4 1-5 3-7 3-5 1-2 3-4 5-6

See Figure 11.7 for a diagram that contains these networks. They appear in many places in the diagram, for example in the upper left corner. One of the main reasons to study such methods is that many of the compare-exchanges can be done independently in parallel stages. For example, for eight elements, the sort can be finished in six parallel stages:

stage 1: 0-1 2-3 4-5 6-7
stage 2: 0-2 1-3 4-6 5-7
stage 3: 1-2 5-6 0-4 3-7
stage 4: 2-6 1-5
stage 5: 2-4 3-5
stage 6: 1-2 3-4 5-6
Note that there is some flexibility in assigning comparators to stages: for example, 0-4 and 3-7 could be moved from stage 3 to stage 4 in the example just given. Modify your program to print out the appropriate stage number along with each comparator. It might be the case that they do not come out in order by stage, but you can use the system sort filter to fix that problem. Check against the above data for 8 elements, and hand in your output for 16 and 32. In your readme writeup, give a formula for the number of parallel stages needed to sort N elements, when N is 2 to the nth power.

Your final task is to use batcher.c to implement a block sort, as follows. Have your main program divide an array to be sorted into blocks of size M, where M is a power of 2 (and therefore divides evenly into N, which is also a power of 2). First, sort each of the blocks using insertion sort. Next, call mergesort (which is in a separate file and still should not need to be changed at all), but provide an implementation of the compare-exchange abstract operation that merges the two blocks specified, putting the smaller half of the elements in the first block and the larger half of the elements in the second block. (It is amazing but true that this works!) Run your program for 1024, 2048, 4096, 8192, and 16384 elements and with blocksizes 8, 16, and 32.

This is an exercise in algorithm-sharing via separate compilation, not full ADT implementation. You do not need to use make. Submit three different main programs, with different definitions for the compare-exchange function in each. You can use more elaborate mechanisms to share code, but this one will suffice for this assignment.

Extra Credit A. Turn your output in the second part of the assignment into a Postscript program that prints out a graphical representation of the sort as network of N lines, with compare-exchange modules connecting the lines (organized in parallel stages).

Extra Credit B. Get batcher.c to work when N is not a power of 2.

Due: 11:59PM Thursday, February 22.