COS 226 Programming Assignment

Shellsorting a linked list

Implement Shellsort for a linked list, based on a variant of bubble sort. The rather severe constraints imposed by the singly-linked list organization presents special problems for implementing Shellsort. Your task is to overcome these constraints and develop a simple, elegant implementation that does not sacrifice efficiency or waste space.

Step 1. First, implement a routine to build a list: read integers from standard input, and build a list node for each item consisting of an integer key and a link to the node for the next item. Also, write a routine to print out the contents of the linked list.

Step 2. Next, implement bubble sort: it should take as input a link to the beginning of a list, and should rearrange the nodes of the list so that they are in sorted order. This is not the same as rearranging keys within nodes: that is to be avoided because, in real applications, other information might be associated with the keys, or other data structures might have pointers to the nodes. You may use dummy nodes or circular lists as you see fit, but do not use doubly-linked lists. That is, you should use only one link per node, with a constant extra number of links for bookkeeping.

Step 3. Next, implement a bubble h-sort. You will need to discover an efficient way to do this, even for large h. Make sure that one pass of your bubble h-sort uses at most a constant amount of auxiliary space and runs in time linear in the input size N. Note that h does not count as a constant - this is too much. However, in general, your bubble h-sort routine will still need more than a constant number of passes to produce a file that is h-sorted. You should think about how you are going to solve this problem before sitting down to write the code.

Step 4. Finally, write a Shellsort variant that uses your bubble h-sort. Use Knuth's increment sequence (1, 4, 13, 40, ...) as in Sedgewick, Program 6.5.

Analysis. Use explicit counters to determine how many key comparisons and exchanges are used. Report these counts and the amount of time your program takes to sort random input files of size 100, 1,000, 10,000, 100,000 and 1,000,000. Include in your readme.txt a two-dimensional table with a row corresponding to each file size and columns giving the total number of comparisons and exchanges used. Also, include a column that indicates how many seconds your program takes on each input.

Input and output. The input consists of integers separated by whitespace. You may use the following sample input files to test your program. Your program should output the intermediate stages of the sort for debugging and to provide evidence that your program works. For example, on the input file random10.txt, your program should output:

a.out < random10.txt
       h  pass       cmp      exch    18 79 46 75 99 91 98 53 10 23 
----------------------------------
       4     1                        10 23 46 53 18 79 98 75 99 91 
       4     2        12         5    10 23 46 53 18 79 98 75 99 91 
       1     1                        10 18 23 46 53 75 79 98 91 99 
       1     2                        10 18 23 46 53 75 79 91 98 99 
       1     3        27         7    10 18 23 46 53 75 79 91 98 99 
----------------------------------
   Total     5        39        12
If N is larger than 20, for brevity only print out the number of comparisons and exchanges for each value of h.
a.out < random100000.txt
       h  pass       cmp      exch      
----------------------------------
   88573     2     22854      5722      
   29524     4    281904     49788      
    9841    10    901590     94555      
    3280    16   1547520    148293      
    1093    21   2077047    203626      
     364    29   2889444    274828      
     121    36   3595644    391385      
      40    49   4898040    584519      
      13    46   4599402    634147      
       4    28   2799888    388543      
       1    16   1599984    175669      
----------------------------------
   Total   257  25213317   2951075
Extra Credit A. The method is much more effective if you alternate directions in passes through the list. Make your program change the direction of all the links on the way through the list, so that it goes the other direction on the next pass. Find a good increment sequence for that variant.

Extra Credit B. Design an increment sequence that consistently outperforms the sequences described in the text. Justify your method with numerical experiments.