Computer Science 226
Algorithms and Data Structures
Spring 2013


Course Information | Assignments | Exercises | Lectures | Exams | Precepts | Study Guide | Booksite

Undirected Graphs
Directed Graphs
MSTs
Shortest Paths
Max flow
String Sorts
Tries
Substring Matching
Regular Expressions
Compression
Reductions and Intractability

Table of links incoming soon to help manage the length of this page.

Study Guide

This page provides a list of what I consider the most important topics and ideas from each lecture. The optional exercises listed on this page are not for a grade. Instead, they are to provide you with a better idea of what we're expecting you to know before the final.

Optional exercises are either C problems, B problems, or A problems. C problems are at the same level as the exercises, and generally require that you only be able to mechanically carry out the algorithms. B problems are harder problems that require you to really know what's going on (the midterm had plenty of these). A problems are problems that either cover tangential material (like how Kosaraju-Shamir works), require deep comprehension of the material, involve very clever tricks, or perhaps even all 3.

Don't be intimidated by its length. I tried to be as comprehensive as possible.

Undirected Graphs

Key Concepts

Graph Representation
There are many possible internal representations of a graph. We discussed three. You should be familiar with the tradeoffs for each.

Important graph problems

You should know how to solve the first three -- not necessarily because these problems are so important, but because you should have at least a few examples of how to use DFS or BFS to solve problems. For the last four, it's good scholarship to be know whether the problem is tractable (polynomial time) and whether there exists an easy to understand tractable algorithm. We may even ask you how these things on an exam. You do not need to be familiar with the details of the best known algorithm for their solution.

One important meta-point is that it is non-trivial to determine the difficulty level of a graph problem. For example, finding an Euler tour is polynomial time and the algorithm is non-trivial but not hideously complex. By contrast, finding a Hamilton tour is an intractable problem. Planarity can be solved in linear time, but it's a rather complex algorithm to follow. Graph isomorphism is particularly notable since its tractability has remained elusive despite decades of research.

Important graph tools

Graph client design pattern (used for 4.1, 4.2, 4.3, and 4.4). Instead of creating highly complex classes that support basic graph operations AND solution of important graph problems, we have a basic class (e.g. Graph) that gives us basic operations, and special purpose clients that solve subproblems on a graph (e.g. ConnectedComponents(Graph g)).

Recommended Problems

C level

  1. Fall 09 Final, #2a
  2. Textbook: 4.1.12

B level

  1. Fall 09 Final, #2b
  2. Fall 10 Final, #2c
  3. Textbook: 4.1.14, 4.1.10
  4. New (4/2/2013): What is the run time for DFS or BFS if we use an adjacency matrix insted of adjacency lists?

A level

  1. Fall 12 Final, #15
  2. Textbook: 4.1.33

Directed Graphs

Key Concepts

Using directed graphs vs. undirected graphs. Know when to use each. This should be fairly natural. For example, you'd use an undirected graph to represent friend connections between Facebook users, and a directed graph to represent following connections on Twitter.

BFS and DFS are directed graph algorithms. Many canonical unweighted digraph problems (e.g. shortest paths and reachability) can be solved using code that is effectively identical to code for solving problems on unweighted graphs.

Important digraph problems.

Important digraph tools.

Recommended Problems

C level

  1. Spring 08 Final, #1a, #1b
  2. Fall 08 Final, #1a, #1b
  3. Fall 10 Final, #3a
  4. Spring 12 final, #3a, #3ab

B level

  1. Spring 08 Final, #1c, #1d
  2. Fall 08 Final, #1c
  3. Fall 10 Final, #3b
  4. Fall 10 Final, #2d, #2e
  5. Textbook: 4.2.10, 4.2.20

A level

  1. Fall 2008, #11
  2. 4.2.19, 4.2.22 (exam will not require you to know why Kosaraju-Sharir works)
  3. 4.2.27
  4. 4.2.40

Just for fun

  1. Write code similar to (or copy) the webcrawler shown in lecture slide 48.

Minimum Spanning Trees

Definition of a weighted graph and an MST.

Cuts.

Manually performing Kruskal's and Prim's algorithms. These are easy and you should be able to do them even under the influence of powerful sedatives.

Why Kruskal's and Prim's algorithms work. You should understand the "Fundamental Idea" slide (#37) in the flipped lecture.

Implementing Kruskal's and Prim's algorithms.

Recommended Problems

C level

  1. Spring 08 Final, #2a, #2b
  2. Fall 08 Final, #2a, #2b
  3. Fall 09 Final, #1b
  4. Fall 09 Final, #3a, #3c
  5. Fall 10 Final, #4a, #4b
  6. Fall 10 Final, #3a, #3b
  7. Spring 12 Final, #4a, #4b
  8. Would Kruskal's or Prim's algorithm work with weighted directed graphs?

B level

  1. Fall 09 Final, #3b
  2. Fall 12 Final, #4a, #4b, #4c, #4d
  3. Textbook 4.3.8 - this is a particularly handy concept!
  4. Textbook 4.3.12
  5. Textbook 4.3.13
  6. Textbook 4.3.20
  7. True or False: Given any two components that are generated as Kruskal's algorithm is running (but before it has completed), the smallest edge connecting those two components is part of the MST.
  8. Textbook 4.3.19 - to be clear, they mean that the PQ is implemented with a sorted list.
  9. Note the solution on the booksite is for an unsorted list. That's also a fine problem.

A level

  1. Spring 08 Final, #3
  2. Fall 09 Final, #3d
  3. Textbook 4.3.26

Just for fun

  1. Figure out if Kruskal's, Prim's Lazy, or Prim's Eager algorithm are faster on "typical" graphs. Use the code from the booksite.

Shortest Paths

Key Concepts

Graph Representation

Shortest path representation

Edge relaxation

Manually performing Dijkstra's, Acyclic Shortest Paths, and Bellman Ford - These should be very easy to manully execute.

Bellman Ford algorithm

Dijkstra's algorithm

Acyclic shortest paths

Recommended Problems

C level

  1. Fall 2009 Final, #4
  2. Fall 2010 Final, #5
  3. Textbook 4.3.1 and 4.4.1

B level

  1. Spring 2008 Final, #11 (this should be very easy if you've done Seam Carving!)
  2. Fall 2011 Final, #4
  3. Spring 2012 Final, #5
  4. Fall 2012 Final, #5
  5. Think about how to solve the undirected version of the shortest paths problem. Why is it easy for graphs with positive edge weights? Why is it hard if we include negative edge weights?
  6. Textbook 4.4.25
  7. Since Prim's and Dijkstra's algorithms are so similar, why do we need to check for cycle in Prim's, but not in Dijkstra's?
  8. Textbook 4.4.42 and 4.4.44

A level

  1. Fall 2008 Final, #11 (I remember getting asked this at a career fair as a freshman!)
  2. Textbook 4.4.34
  3. Textbook 4.4.37

Max-Flow

Under construction

Key Concepts

Graph Representation

Max-flow, min-cut, and Ford-Fulkerson basics

Ford-Fulkerson details

Recommended Problems

C level

  1. Fall 2011, #11
  2. Spring 2012, #11
  3. Fall 2012, #6

B level

  1. Textbook 6.36
  2. Textbook 6.38

A level

  1. Textbook 6.37

String sorts

Key Concepts

Terminology Key indexed counting. Allows you to sort N keys in linear time (better than linearithmic). Beats linearithmic bound by avoiding any comparisons. This is a completely different philosophy for how things should be sorted. This is the most important concept for this lecture.

Manually performing LSD, MSD, and 3-way radix quicksort. Should be doable in your sleep.

LSD.

MSD.

3-way radix quicksort.

Implementation issues.

Recommended Problems

C level

  1. Spring 2012 Final, #6

B level

  1. Fall 2008 Final, #6
  2. Fall 2012 Final, #7
  3. Spring 2008 Final, #12
  4. Textbook 5.1.8, 5.1.10

A level

  1. How could we avoid the performance hit form our special charAt function?
  2. What makes MSD cache unfriendly?
  3. The difference between MSD and 3-way radix quicksort is that when recursion depth is i, MSD fully sorts all keys by the ith character using key indexed counting, and 3-way quicksort merely partitions on the ith character. Both of these processes take linear time, but key-indexed counting gets us closer to our final goal. Succintly state why we opt to use the equally slow but less-powerful partitioning procedure.
  4. Textbook 5.1.22. The results may surprise you.

Tries

Key Concepts

Terminology and Basics Hash Tables and LLRBs with String keys Tries TSTS

Recommended Problems

C level

  1. Spring 2008 Final, #5
  2. Spring 2008 Final, #4
  3. Fall 2011 Final, #8
  4. Fall 2012 Final, #8
  5. Textbook 5.2.3, 5.2.4

B level

  1. Fall 2009 Final, #5
  2. Spring 2012 Final, #9
  3. Textbook 5.2.21 (just design the API)
  4. When would we want to use an R-way trie instead of a TST?
  5. Give a worst case input that causes a TST to be unbalanced (and thus slow).
  6. Is the number of character compares in an R-way trie strictly better than for an LLRB? For a hash table? Is a trie guaranteed to be faster in overall run time than an LLRB or a hash table?

A level

  1. When might we want to use a hash table instead of a TST?
  2. What would happen if we had an R-way trie where we keep our next nodes in a linked list? How would this affect memory performance? Timing? What if we used an LLRB? A hash table?
  3. Repeat Spring 2013 midterm problem #8, but assume your keys are Strings instead of Comparables.

Substring Matching

Key Concepts

Terminology and Basics KMP Boyer-Moore Rabin Karp

Recommended Problems

C level

  1. Spring 2008 Final, #6
  2. Fall 2009 Final, #6
  3. Fall 2012 Final, #10

B level

  1. Fall 2009 Final, #5
  2. Fall 2009 Final, #10 [great problem!]
  3. Fall 2010 Final, #10
  4. Fall 2011 Final, #6
  5. Spring 2012 Final, #7
  6. Fall 2012 Final, #9
  7. Give an example of when you might want to use KMP? Boyer Moore? Rabin Karp?

A level

  1. Fall 2010 Final, #10
  2. Fall 2012 Final, #14
  3. Give an example of when you might prefer to use the Monte Carlo version of Rabin Karp over the Las Vegas version.
  4. Textbook: 5.3.22
  5. Textbook: 5.3.26

Regular Expressions

Key Concepts

Terminology and Basics NFA Simulation NFA Construction

Recommended Problems

C level

  1. Spring 2008 Final, #8
  2. Fall 2008 Final, #8
  3. Fall 2009 Final, #7
  4. Fall 2010 Final, #9
  5. Spring 2012 Final, #10
  6. Spring 2012 Final, #11
  7. 5.4.1, 5.4.2

B level

  1. Fall 2011 Final, #7
  2. Textbook 5.4.16, 5.4.17, 5.4.18

A level

None!

Just for fun

  1. 5.4.13

Data Compression

Key Concepts

Terminology and Basics Run length coding Huffman coding LZW Kolmogorov Complexity

Recommended Problems

C level

  1. Spring 2008 Final, #4
  2. Fall 2008 Final, #7
  3. Fall 2011 Final, #10b
  4. Textbook 5.5.3
  5. Why is the Kolmogorov complexity of "a" (relatively) easy to find?

B level

  1. Fall 2011 Final, #10a
  2. Textbook 5.5.13
  3. Textbook 5.5.17
  4. Is the Kolmogorov complexity of π small or large?

A level

  1. Fall 2012 Final, #13
  2. Textbook 5.5.23 (don't know the answer myself, I'm quite curious)

Reductions and Intractability

Key Concepts

Recommended Problems

Note: The Spring 2013 final will not feature one of those really sneaky reduction problems (typically part a of the B level and A level problems below). However, we will probably ask you about how reductions work.

C level

  1. Textbook 6.52

B level

  1. Textbook 6.57
  2. Textbook 6.61, but replace "search" with "decision" to match our terminology from class.
  3. Textbook 6.62
  4. Spring 2008 Final, #8
  5. Fall 2008 Final, #12
  6. Fall 2009 Final, #11
  7. What does the Venn diagram look like for NP-Complete and NP?
  8. If P=NP, then why is the set of NP-Complete problems the same as P?

A level

  1. Fall 2010 Final, #10
  2. Fall 2011 Final, #13
  3. Spring 2012 Final, #13
  4. Fall 2012 Final, #15