SUBSTRING SEARCH STUDY GUIDE
Terminology and Basics
 Substring search problem: Find an instance (or all instances) of a
query substring of length M in a text string of length N.
 The substring must be precisely defined (no regular expressions).
 You should be able to manually use a KMP DFA, and you should be able to
manually carry out Boyer-Moore and Rabin-Karp.
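The problem statement above can be made concrete with a brute-force search. This is only a minimal sketch and not code from this guide: the class name BruteForceSketch is hypothetical, and returning N when there is no match is an assumed convention.

```java
// Hypothetical illustration class; the name is not from the guide.
final class BruteForceSketch {
    // Returns the index of the first occurrence of pat in txt,
    // or txt.length() (that is, N) if there is none (assumed convention).
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            // Compare the pattern against the text window starting at i.
            while (j < m && txt.charAt(i + j) == pat.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // all M pattern characters matched
            }
        }
        return n; // no match found
    }

    public static void main(String[] args) {
        System.out.println(search("ABRA", "ABACADABRAC")); // prints 6
        System.out.println(search("XYZ", "ABACADABRAC"));  // prints 11
    }
}
```

Every alignment i restarts the comparison at pattern position 0, and avoiding that restart is what KMP is designed to do.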
KMP
 How do you construct the DFA?
 How much time does it take if you resimulate every time you have a mismatch?
It's ok if you don't fully understand the linear time construction process.
 What is the best-case running time for DFA construction and DFA simulation?
The worst-case running time?
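For readers who want to check a hand-built DFA, the following sketch shows one way to construct and simulate it. It assumes the dfa[c][j] layout (next state when character c is read in state j) and a restart state x in the style of the textbook's linear-time construction. The class name KmpDfaSketch, the alphabet size R = 256, and returning N on failure are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; the name is not from the guide.
final class KmpDfaSketch {
    private static final int R = 256; // assumed alphabet size (extended ASCII)

    // dfa[c][j] = next state when character c is read in state j.
    static int[][] buildDfa(String pat) {
        int m = pat.length();
        if (m == 0) {
            throw new IllegalArgumentException("pattern must be non-empty");
        }
        int[][] dfa = new int[R][m];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < R; c++) {
                dfa[c][j] = dfa[c][x];     // mismatch: copy from restart state x
            }
            dfa[pat.charAt(j)][j] = j + 1; // match: advance to the next state
            x = dfa[pat.charAt(j)][x];     // update the restart state
        }
        return dfa;
    }

    // Returns the index of the first match, or txt.length() if none.
    static int search(String pat, String txt) {
        int[][] dfa = buildDfa(pat);
        int m = pat.length();
        int n = txt.length();
        int i;
        int j;
        for (i = 0, j = 0; i < n && j < m; i++) {
            j = dfa[txt.charAt(i)][j]; // one table lookup per text character
        }
        return (j == m) ? i - m : n;
    }

    public static void main(String[] args) {
        int[][] dfa = buildDfa("ABBABA");
        System.out.println("A " + Arrays.toString(dfa['A'])); // A [1, 1, 1, 4, 1, 6]
        System.out.println("B " + Arrays.toString(dfa['B'])); // B [0, 2, 3, 0, 5, 3]
    }
}
```

Construction fills R entries for each of the M states, and simulation does one lookup per text character without ever backing up in the text.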
Boyer-Moore
 What is the mismatched character heuristic? Why do we use the rightmost character?
 Why is the mismatched character heuristic strictly suboptimal?
Why do we use it anyway? Because the basic idea behind the full algorithm is very similar to KMP,
and you'll learn it if you ever really need to.
 What is the best-case running time? The worst-case running time?
 Which inputs result in best- and worst-case performance?
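The mismatched character heuristic asked about above can be sketched as follows. This is a minimal version that uses only that heuristic, with a right[] table holding the rightmost position of each character in the pattern. The class name BoyerMooreSketch, the alphabet size R = 256, and returning N on failure are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; the name is not from the guide.
final class BoyerMooreSketch {
    private static final int R = 256; // assumed alphabet size (extended ASCII)

    // Returns the index of the first match, or txt.length() if none.
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();

        // right[c] = index of the rightmost occurrence of c in pat, or -1.
        int[] right = new int[R];
        Arrays.fill(right, -1);
        for (int j = 0; j < m; j++) {
            right[pat.charAt(j)] = j;
        }

        int skip;
        for (int i = 0; i <= n - m; i += skip) {
            skip = 0;
            // Scan the pattern from right to left against the text window.
            for (int j = m - 1; j >= 0; j--) {
                char c = txt.charAt(i + j);
                if (pat.charAt(j) != c) {
                    // Line up the rightmost c in pat with the text character,
                    // but always move forward by at least one position.
                    skip = Math.max(1, j - right[c]);
                    break;
                }
            }
            if (skip == 0) {
                return i; // no mismatch in this window
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(search("NEEDLE", "FINDINAHAYSTACKNEEDLEINA")); // prints 15
    }
}
```

When the mismatched text character does not occur in the pattern at all, right[c] is -1 and the window moves j + 1 positions at once.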
Recommended Problems
C level
 Fall 2012, #10 (Boyer-Moore)

(a) Given the following KMP DFA, give the string that this DFA searches for
j   0  1  2  3  4  5  6
A   1  1  3  1  5  1  5
B   0  2  0  4  0  6  7
(b) Below is a partially completed KMP DFA for a string s of length 6 over the alphabet {A, B}. State 6 is the accept state. Fill in the missing spots in the table.
j              0  1  2  3  4  5
pat.charAt(j)  _  _  _  _  _  _
A              1  1  _  _  _  _
B              _  _  3  _  _  3
(c) Given each of the following strings as input, what state would the DFA in (a) end in?
BABAA
ABABABA
BABABABA
BBAABBABAB
Answers
(a) ABABABB
(b)
j              0  1  2  3  4  5
pat.charAt(j)  A  B  B  A  B  A
A              1  1  1  4  1  6
B              0  2  3  0  5  3
(c) 1,5,5,4
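The answers to (c) can be checked by simulating the table from (a) directly. The sketch below hard-codes that table; the class name DfaSimulationSketch and the choice to stop once the accept state 7 is reached are assumptions made for illustration.

```java
// Hypothetical illustration class; the name is not from the guide.
final class DfaSimulationSketch {
    private static final int ACCEPT = 7;

    // Transition table from part (a): row 0 is input 'A', row 1 is input 'B';
    // the column index is the current state (0 through 6).
    private static final int[][] DFA = {
        {1, 1, 3, 1, 5, 1, 5},
        {0, 2, 0, 4, 0, 6, 7},
    };

    // Returns the state the DFA is in after reading the input.
    // Assumes the input contains only the characters 'A' and 'B'.
    static int endState(String input) {
        int state = 0;
        for (int i = 0; i < input.length() && state != ACCEPT; i++) {
            int row = (input.charAt(i) == 'A') ? 0 : 1;
            state = DFA[row][state];
        }
        return state;
    }

    public static void main(String[] args) {
        String[] inputs = {"BABAA", "ABABABA", "BABABABA", "BBAABBABAB"};
        for (String input : inputs) {
            System.out.println(input + " -> " + endState(input));
        }
    }
}
```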
B level
 Fall 2011 Final, #6 (KMP)
 Spring 2012 Final, #7 (KMP)
 Fall 2012, #9 (KMP)
 Give an example of when you might want to use KMP, Boyer-Moore, and Rabin-Karp.
A level

For each algorithm (the version discussed in lecture and the textbook),
give the worst-case order of growth in terms of M and N.
 brute-force substring search for a query string of size M
in a text string of size N
 Knuth-Morris-Pratt substring search for a query string of
size M in a text string of size N
 Boyer-Moore (with only the mismatch heuristic) substring
search for a query string of size M in a text string of size
N
 simulating a DFA with M vertices and 2M edges on a text
string of size N
Answers
MN; N or M + N; MN; N
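The MN answer for brute force can be checked empirically by counting character compares on a worst-case style input, where both the pattern and the text are a run of A's followed by a single B. This is only a sketch under that assumed input family; the class name CompareCountSketch and its helper aThenB are hypothetical.

```java
// Hypothetical illustration class; the name is not from the guide.
final class CompareCountSketch {
    // Builds a string of (length - 1) A's followed by a single B.
    static String aThenB(int length) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < length - 1; k++) {
            sb.append('A');
        }
        return sb.append('B').toString();
    }

    // Counts character compares made by brute-force search,
    // stopping at the first match or at the end of the text.
    static long bruteForceCompares(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();
        long compares = 0;
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            while (j < m) {
                compares++;
                if (txt.charAt(i + j) != pat.charAt(j)) {
                    break; // mismatch: slide the window by one
                }
                j++;
            }
            if (j == m) {
                break; // full match found
            }
        }
        return compares;
    }

    public static void main(String[] args) {
        // M = 4, N = 10: each of the 7 alignments costs 4 compares.
        System.out.println(bruteForceCompares(aThenB(4), aThenB(10))); // prints 28
    }
}
```

On this input family the count is M(N - M + 1), which grows like MN when M is small relative to N.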
 Textbook: 5.3.22
 Textbook: 5.3.26