STRING SORTS STUDY GUIDE


Terminology

Key-indexed counting. Sorts N keys that are integers between 0 and R-1 in time proportional to N + R. It beats the linearithmic lower bound for compare-based sorting because it never compares two keys; instead it uses each key directly as an array index. This is a completely different philosophy for how things should be sorted, and it is the most important concept for this lecture.
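The counting idea above can be sketched in a few lines of Java. This is a minimal illustration, assuming the keys are plain ints in the range 0 to R-1; the class and method names are made up for this guide and are not taken from the lecture code.

```java
class KeyIndexedCounting {
    // Returns a new array holding the keys in ascending order.
    // Assumes every key k satisfies 0 <= k < R.
    static int[] sort(int[] keys, int R) {
        int n = keys.length;
        int[] count = new int[R + 1];

        // Step 1: count how often each key occurs (offset by one slot).
        for (int k : keys) {
            count[k + 1]++;
        }

        // Step 2: turn counts into starting indices (cumulative sums).
        for (int r = 0; r < R; r++) {
            count[r + 1] += count[r];
        }

        // Step 3: distribute keys into their final positions.
        // Scanning left to right keeps equal keys in their original order (stable).
        int[] aux = new int[n];
        for (int k : keys) {
            aux[count[k]++] = k;
        }
        return aux;
    }
}
```

For example, calling `KeyIndexedCounting.sort(new int[]{3, 0, 2, 3, 1, 0}, 4)` returns `{0, 0, 1, 2, 3, 3}`. The three loops touch N, R, and N entries respectively, which is where the N + R running time comes from. LSD and MSD apply this same pass with the character at one position serving as the key.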

Manually performing LSD and MSD. Should be doable in your sleep.

LSD.

MSD.

3-way String Quicksort.

Suffix sorting.

Recommended Problems

C level

  1. Spring 2012, #6

B level

  1. Fall 2012, #7
  2. Textbook 5.1.8, 5.1.10

A level

  1. Fall 2012, #14
  2. How could we avoid the performance hit from our special charAt() function?
  3. What makes MSD cache unfriendly?
  4. The addBlock() operation is used to add M Strings to an existing sorted data set of N Strings, where M << N. A data set of size N is considered sorted if it can be iterated through in sorted order in time proportional to N.

    COS226 student Frankie Halfbean makes two choices. First, he selects a sorted array as the data structure. Second, he selects insertion sort as the core algorithm, explaining that insertion sort is very fast for almost-sorted arrays. To add a new block of M Strings, the algorithm creates an array of length N+M, copies the old N values into the new array, copies the new M values to the end of the array, and finally uses insertion sort to bring everything into order. The old array is left available for garbage collection.
    (a) What is the worst case order of growth of the run time as a function of N and M?
    (b) Design a scheme that has a better order of growth for the run time in the worst case.
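
As a starting point for problem 4, Frankie's scheme as described can be sketched roughly as follows. This is only an illustration of the steps in the problem statement, not a reference solution; the class name SortedBlockStore and the toArray() helper are hypothetical and exist only so the sketch is self-contained.

```java
class SortedBlockStore {
    // Current sorted data set (N Strings); starts out empty.
    private String[] data = new String[0];

    // Adds a block of M Strings using Frankie's approach:
    // copy everything into a new array of length N+M, then insertion sort it.
    void addBlock(String[] block) {
        int n = data.length;
        int m = block.length;
        String[] merged = new String[n + m];
        System.arraycopy(data, 0, merged, 0, n);   // old N values first
        System.arraycopy(block, 0, merged, n, m);  // new M values at the end

        // Plain insertion sort over the whole array.
        for (int i = 1; i < merged.length; i++) {
            for (int j = i; j > 0 && merged[j].compareTo(merged[j - 1]) < 0; j--) {
                String tmp = merged[j];
                merged[j] = merged[j - 1];
                merged[j - 1] = tmp;
            }
        }
        data = merged;  // the old array becomes garbage
    }

    // Hypothetical helper: returns a copy of the data in sorted order.
    String[] toArray() {
        return data.clone();
    }
}
```

Tracing how far each of the M new Strings can travel in the inner loop is a good way to approach part (a).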