PROGRAMMING ASSIGNMENT 2

Sorting and merging networks

Implement a recursive shuffle-exchange mergesorting network, and using three different implementations of the compare-exchange operation, get it to The purpose of this assignment is to introduce you to a doubly-recursive divide-and-conquer method that has important applications in parallel sorting hardware and large-scale parallel sorting while at the same time giving you experience using abstract operations on data and separately compiled programs.

Throughout this assignment, you may assume that the number of elements to be sorted N is a power of two. For sorting, use a single compare-exchange for N=2 and standard recursive mergesort for N>2. For merging, use a recursive method based on Batcher's odd-even merge that can be described as follows:

The following table gives an example of how the merge proceeds for a small set of sample keys.
A S O R T I N G A E X A M P L E (initial input)
A G I N O R S T A A E E L M P X (sort halves (recursively))
A I O S A E L P G N R T A E M X (unshuffle to get two merging problems)
A A E I L O P S A E G M N R T X (do merges (recursively))
A A A E E G I M L N O R P T S X (shuffle)
A A A E E G I L M N O P R S T X (even-odd compare-exchange)
This sort has the property that the sequence of compare-exchange operations is completely independent of the data. The double recursion is an easy way to implement the sort, but it also leads to a sequence of compare-exchanges that would seem quite mysterious if we did not know the recursive structure.

First, implement this method as a standard index sort. Initialize an index array with a[i]=i, then shuffle and unshuffle the index array during the recursive sort, accessing an array data only indirectly when a compare-exchange routine that takes two integer indices as argument is called. You may use an auxiliary array to implement the shuffle and unshuffle operations in a straightforward fashion. Your main task is to implement and debug the two recursive divide-and-conquer algorithms (the sort and the merge).

Put the implementation of the sort and merge in one file and the main program, data and index arrays, and compare-exchange implementation in another. The next two parts of this program will accomplish different effects by providing quite different implementations of the compare-exchange. Run your program for 16, 32, 64, 128, 256, 512, 1024, 2048, 4096 elements and print out the number of compare-exchanges used for each run.

The shuffling is for our convenience in organizing the computation; it is also interesting to know what compare-exchanges the method does if there is no shuffling. Your second task is to write a different driver program that uses a different compare-exchange implementation to print out the compare-exchanges actually done on the data, using the original data indices. Your Batcher's mergesorting program should be compiled separately and should not be changed at all. The sequences for 2, 4, and 8 elements are:

2: 0-1
4: 0-1 2-3 0-2 1-3 1-2
8: 0-1 2-3 0-2 1-3 1-2 4-5 6-7 4-6 5-7 5-6 0-4 2-6 2-4 1-5 3-7 3-5 1-2 3-4 5-6

One of the main reasons to study such methods is that many of the compare-exchanges can be done independently in parallel stages. For example, for eight elements, the sort can be finished in six parallel stages:

stage 1: 0-1 2-3 4-5 6-7
stage 2: 0-2 1-3 4-6 5-7
stage 3: 1-2 5-6 0-4 3-7
stage 4: 2-6 1-5
stage 5: 2-4 3-5
stage 6: 1-2 3-4 5-6
Modify your program to print out the appropriate stage number along with each comparator. It might be the case that they do not come out in order by stage, but you can use the system sort filter to fix that problem. Check against the above data for 8 elements, and hand in your output for 16 and 32. In your writeup, give a formula for the number of parallel stages needed to sort 2^n elements.

The third task is to use the method to implement a block sort, as follows. Have your main program divide an array to be sorted into blocks of size M, where M is a power of 2 (and therefore divides evenly into M). First, sort each of the blocks using insertion sort. Next, call your Batcher's mergesort (which is in a separate file and still should not need to be changed at all), but provide an implementation of the compare-exchange abstract operation that merges the two blocks specified, putting the smaller half of the elements in the first block and the larger half of the elements in the second block. (It is amazing but true that this works!) In your writeup, give amount of time it would take p processors to sort N words if an M by M merge takes M time units on one processor, and we assume that the overhead to get the processors working on the merging problems is negligible by comparison with the time to do the merges.

This is an exercise in separate compilation, not full ADT implementation. You do not need to use a .h file or make. Simply put a declaration for the compare-exchange function in your Batcher's mergesort program, with different definitions for the function in each of the three main programs. Then a command like lcc main1.c batcher.c will compile the two functions, link the compare-exchange calls in the sort to the definition in the main program, and make an a.out file. You can use more elaborate mechanisms, but this will suffice for this assignment.

Extra Credit A. Do the shuffle and unshuffle operations without using any extra space.

Extra Credit B. Turn your output in the second part of the assignment into a Postscript program that prints out a graphical representation of the sort as network of N lines, with compare-exchange modules connecting the lines (organized in parallel stages).

Extra Credit C. Get your program to work when N is not a power of two.

Due: 11:59PM Friday, February 23.