NAME: Final Exam May 24, 2001 Computer Science 226

Warning: these solutions were written rather quickly, so if you notice any typos or bugs, please let us know.

1. The following table describes a pattern-matching automaton with start state 0 and accept state 3. Describe the language recognized by this automaton.

```                    0  1  2  3
-------------
0 |  0  2  0  3
1 |  1  1  3  3
```
All bitstrings that contain the substring 101.

2. The following table gives the KMP state transitions for the string 10100110, except for one state transition, labelled X in the table. What should the value of X be?

```                 0  1  2  3  4  5  6  7
------------------------
0 |  0  2  0  4  5  0  2  8
1 |  1  1  3  1  X  6  7  1
```
X = 3.

3. It is easy to build a nondeterministic FSA corresponding to a regular expression, and a basic theorem from automata theory says that we can convert every nondeterministic FSA to a deterministic one. Why do we not do so to implement RE pattern matching? Circle one of the following choices.

```    A. The deterministic FSA might have an exponential number of states.

B. There might exist a string that the deterministic FSA
recognizes but the nondeterministic FSA does not.

C. There might exist a string that the nondeterministic FSA
recognizes but the deterministic FSA does not.

D. The nondeterministic FSA might loop.

E. The proof of the theorem is not constructive (does not
tell us how to construct the deterministic FSA).
```
A.

4. Fill in the following table with dictionary entries that could be produced for a Huffman encoding of the string accctggccccccg.

```        a 110
c 0
g 10
t 111
```

5. Which of the following best describes the asymptotic growth (within a constant factor) of the length of the code produced by the two versions of Lempel-Ziv encoding for a string of 8N characters made up of copies of the string abcdefgh ? Give two answers, one for the pure version (LZ) and the other for the dictionary-based version (LZW).
```
A.  log N

B.  (log N)*(log N)

C.  square root of N

D.  N * log N

E.  N
```

```
LZ    A

LZW   C
```

LZ compression encodes the next character(s) by outputting the next character if there is no previous match or by outputting the number of characters back to the longest previous match. Thus, the second copy of abcdefgh is produced with just one codeword. The third and fourth copies together are produced with a single codeword. At each iteration the number of characters represented by a codeword doubles. Hence the codelength is O(log N).

6. Draw the subdivision of the plane that corresponds to inserting the following points into an initially empty 2D tree, in the order given.

`   (2, 1)  (4, 4)  (5, 2)  (10, 2)  (7, 5)  (1, 6)  (8, 4) `

Here is the 2D tree since the subdivision is a bit hard to draw in ASCII. The solution below makes an arbitrary decision due to the degeneracy caused by (5, 2) and (10, 2).

```
(2, 1)
\
(4, 4)
/    \
(1, 6)  (5, 2)
\
(10, 2)
/
(7, 5)
/
(8, 4)
```

7. Suppose that the Graham scan is used to find the convex hull of the points

`   (1, 1)  (4, 5)  (3, 3)  (5, 3)  (6, 1)  (7, 3)  (4, 0)  (3, 2) `
Assume that the scan is counter-clockwise, starting from the point with smallest y-coordinate. Give the order of points being scanned, and, for each point that is not on the convex hull, the point most recently added to the hull when is it eliminated in the scan.

```  Order  Eliminated by
----------------------
(4, 0)
(6, 1)
(7, 3)
(5, 3)       (4, 5)
(4, 5)
(3, 3)       (1, 1)
(3, 2)       (1, 1)
(1, 1)
```

8. What is the minimum number of lines that can appear in the Voronoi diagram of four different points? Draw a small diagram to justify your answer.

It seems to depend on whether you count lines or line segments.

```       |
.   |    .                            |       |       |
|                                 |       |       |
-----------------                    .   |   .   |   .   |   .
|                                 |       |       |
.   |    .                            |       |       |
|

2 lines, 4 line segments             3 lines, 3 line segments
```
9. Consider the digraph on eight nodes, labelled 0 through 7, with the 13 directed edges
```0-7 0-1 1-4 1-6 2-3 3-4 4-2 5-2 6-0 6-3 6-5 7-1 7-3
```
Draw the DFS tree for the standard adjacency-matrix representation. List the edges of each type in the space provided below.
```0-7   tree                          0
0-1   tree                         / \
1-4   tree                        1   7
1-6   tree                       / \
2-3   tree                      4   6
3-4   back                      |   |
4-2   tree                      2   5
5-2   cross                     |
6-0   back                      3
6-3   cross
6-5   tree
7-1   cross
7-3   cross
```

10. List the strongly connected components of the digraph of Problem 9.

There are 3 strongly connected components.

```0-1-6-7
2-3-4
5
```

11. Classify each of the following graph-processing problems as easy or intractable, by writing E or I, respectively, in the space provided.

```
I  Find the largest complete subgraph.

E  Find a set of edges that separates the graph.

I  Find a path that uses each vertex exactly once, or report that none exists.

E  Find a path that uses each edge exactly twice, or report that none exists.

E  Find the smallest set of edges that connects all the vertices.

E  Find the vertices in a digraph that are not on a directed cycle.
```

12. Match each of the given graph-processing algorithms with the problem that it solves. Write the letter corresponding to your answer in the blank to the left of the name of the algorithm. In some cases, more than one answer might be argued, but you must put only one letter per algorithm. If you pick the best matches, you will use all the letters.

```E  Bellman-Ford      A. MST for dense graphs

B  Ford-Fulkerson    B. maxflow

D  Kruskal           C. transitive closure

H  network simplex   D. MST for sparse graphs

G  Floyd             E. negative-cycle detection

A  Prim              F. strong components

C  Warshall          G. all-pairs shortest paths in dense graphs

I  Dijkstra          H. mincost maxflow

F  Kosaraju          I. single-source shortest paths

```

13. Consider the following undirected graph

```     (0-1, 1)  (0-4, 4)  (0-6, 1)  (0-5, 2)  (1-2, 1)  (1-4, 2)
(1-6, 3)  (2-3, 2)  (2-5, 3)  (3-4, 1)  (4-5, 1)  (5-6, 3)
```
where (i-j, w) means that the edge that connects node i to node j has weight w. Give the parent-link representation of the MST computed by Prim's algorithm, using the standard adjacency-matrix representation.

The MST has cost 7. The edges depend on how degeneracy is resolved on the priority queue. In particular, the MST must use exactly one of the weight 2 edges 0-5, 1-4, or 2-3.

```
vertex | 0  1  2  3  4  5  6
----------------------------------
parent-link | 0  0  1  4  1  4  0
```

14. Consider the edges given in question 13 to be directed edges. Give the parent-link representation of the shortest-paths tree from node 0 that is computed by the PFS implementation of Dijkstra's algorithm for the graph of the previous question, using the standard adjacency-lists representation.

```
vertex | 0  1  2  3  4  5  6
----------------------------------
distance | 0  1  2  4  3  2  1
parent-link | 0  0  1  2  1  0  0
```

15. Prove the following statement or give a counterexample: "For any weighted graph G there is node v in G such that a Shortest Path Tree with v as source is identical to a Minimum Spanning Tree of G."

False. Note that the MST will always avoid the largest weight edge in a cycle (assuming there is a unique largest), whereas the shortest path tree might use it. Consider the following graph where the diagonal edges have weight 3 and the other edges have weight 2. A MST consists of three of the weight 2 edges, whereas a shortest path tree will use one of the diagonal edges.

```   A------B
|\    /|
| \  / |
|  \/  |
|  /\  |
| /  \ |
|/    \|
C------D
```
16. Consider the edges in question 13 to be directed edges and the weights on the edges to be capacities. Show a sequence of augmenting paths that could result from the maximum-capacity augmenting-path algorithm in computing a maximum flow from 0 to 6.

The value of the maxflow is 5, which can be achieved by the paths below. A mincut is {0, 4} which confirms that the maxflow is 5.

```
path      amount
----------------
0-5-6          2
0-6            1
0-4-5-6        1
0-1-6          1
```

17. Consider the following network

```     (0-1, 6, 1)  (0-2, 4, 3)  (0-3, 4, 4)  (1-2, 3, 1)  (1-3, 3, 1)
(2-4, 4, 1)  (3-4, 4, 1)  (2-5, 3, 5)  (3-5, 3, 6)  (4-5, 5, 2)
```
where (i-j, w, c) means that there is an edge from node i to node j with capacity w and cost c. Add a dummy edge (5-0, 11, -9) and give a sequence of augmenting cycles that yield a mincost maxflow.

It's not too hard to see that the maxflow value is 11. It is easy to see that the maxflow is at most 11 because there are only 3 arcs entering node 5 and their capacity sums to 11. By inspection, you can verify that it is possible to get 11 units of flow to node 5.

We use a greedy rule to select negative cost cycles, i.e., always choose the most negative cycle. Although this is NP-hard to compute in general, it's fairly easy to do by hand on a small graph. This avoids using any "reverse" edges.

```
cycle         cost       flow      net cost
-------------------------------------------
0-1-2-4-5-0    -4           3           -12
0-2-4-5-0      -3           1            -3
0-3-4-5-0      -2           1            -2
0-1-3-5-0      -1           3            -3
0-2-5-0        -1           3            -3
-------------------------------------------
11           -23
```

18. The following array is a parent-link representation of a tree:

```
0  1  2  3  4  5  6  7  8  9
----------------------------
0  2  0  8  2  6  7  4  4  4
```
Give the parent-link representation of the tree that results from adding the edge 5-1 and removing the edge 2-4.

As in assignment 1, we need to reverse the parent-link path from 5 to 4.

```
0  1  2  3  4  5  6  7  8  9
----------------------------
0  2  0  8  7  1  5  6  4  4
```

19. Define the capacity of a node as the maximum incoming flow it can have. Briefly show how to reduce the problem of finding the maxflow in a network with capacity limits on the nodes to the usual maxflow problem. Answer in one or two sentences.

Replace the node v with two nodes v' and v''. Add an edge from v' to v'' with the desired node capacity. Replace all of the edges (u, v) that enter v to (u, v') so that they enter v'. Replace all of the edges (v, w) that leave v to (v'', w) so that they leave v''.

20. Match each of the given problems with its known worst-case complexity. Write the letter corresponding to your answer in the blank to the left of the name of the method. You must put only one letter per problem.

```C graph coloring             A. linear-time algorithm known

B linear programming         B. polynomial-time (superlinear) algorithm known

B convex hull                C. NP-hard

A graph planarity            D. unknown

B maximum flow

B minimum spanning tree

D graph isomorphism

A string matching
```