COS 126 Lecture 14: Pattern Matching

NOTE: See also Notes on Formal Language

grep

The most common use of grep is to find a particular word or short string in a file, much as you would search for a Web document via Yahoo. The syntax is:

grep expression filename(s)

For example, you might want to find all the lines of code in your C files that refer to a particular variable:

grep 'count' *.c
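
As a minimal sketch of the search above, the commands below build a throwaway C file and search it. The file name demo.c and its contents are made up for illustration; the -n option, which prefixes each match with its line number, is a standard grep option.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)

# demo.c is a hypothetical file invented for this example.
cat > "$dir/demo.c" <<'EOF'
int count = 0;
int total = 0;
count = count + 1;
EOF

# Print every line of every .c file that contains the string "count".
matches=$(grep 'count' "$dir"/*.c)
echo "$matches"

# Same search, with each match prefixed by its line number.
numbered=$(grep -n 'count' "$dir"/*.c)
echo "$numbered"
```

When the pattern *.c expands to more than one file, grep also prefixes each matching line with the name of the file it came from.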

How can you learn about grep?

1. TRY IT!!

2. There is a link to documentation with many examples, accessible via the 126 FAQ list.

3. Type (on a Unix machine):

man grep

You can copy the manual page for grep to a file by redirecting the output with '>' and then print that file. You should learn some of the main options for grep.
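
A few options that are probably worth learning first are sketched below on a made-up file. The file notes.txt and its contents are hypothetical; -i, -v, -c, and -l are standard grep options described in the manual page.

```shell
# Scratch file invented for this example.
dir=$(mktemp -d)
printf 'Count the words\nno match here\ncount again\n' > "$dir/notes.txt"

# -i ignores case, so both "Count" and "count" match.
ignore_case=$(grep -i 'count' "$dir/notes.txt")

# -v inverts the search: print the lines that do NOT match.
inverted=$(grep -v -i 'count' "$dir/notes.txt")

# -c prints the number of matching lines instead of the lines themselves.
how_many=$(grep -c 'count' "$dir/notes.txt")

# -l prints only the names of the files that contain a match.
which_files=$(cd "$dir" && grep -l 'count' *.txt)

echo "$ignore_case"
echo "$inverted"
echo "$how_many"
echo "$which_files"
```

Without -i the search is case sensitive, which is why -c reports only one matching line here.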

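Here are some hints and highlights about grep. Put quotes around the pattern so that the shell does not interpret its special characters. The operators +, ?, |, {n}, and ( ) are extended regular expression syntax; to use them, run egrep or grep -E rather than plain grep: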

[a3][c5] => a or 3 followed by c or 5, so valid matches are ac, a5, 3c, 35
[a-z] => any single lowercase letter from a to z
[^abc] => matches any lines in which at least one character is not a, b, or c
'^a' => matches any lines which start with an 'a'
'^c.b' => matches lines that start with a c and have b as the third character
'.' => matches any lines that contain at least one character (that is, any non-empty line)
'^c.*b' => matches lines that start with c and have at least one b later in the line
'c.*b' => matches any lines that have a c followed by a b (not necessarily adjacent)
'c+b' => matches lines with cb, ccb, cccb, ccccb, etc. as substrings
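'c.+b' => matches lines with cab, caab, cadb, etc. (at least one character between the c and the b) but not cb alone
'c.?b' => matches lines with cb, or with c and b separated by exactly one character (cab, cxb, etc.); a line such as caab does not match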
'c.*b$' => matches any lines that have a c and end with a b
'^c.*b$' => matches any lines that start with c and end with b
'x|y' => matches lines with x or y (no spaces around the |, or the spaces become part of the pattern)
'c{3}' => matches lines with at least 3 consecutive c's
'(bc){2}' => matches lines with the pattern bcbc
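
The patterns above can be checked against a small word list. This is only a sketch, assuming a grep that accepts -E (extended regular expressions, the syntax egrep uses); the file words.txt, its contents, and the helper function matches are invented for this example.

```shell
# Scratch word list; the file name and contents are invented for this example.
dir=$(mktemp -d)
printf '%s\n' ac 35 cb cab caab cccb bcbc dog > "$dir/words.txt"

# matches PATTERN: print the lines that match, joined by spaces.
# (hypothetical helper, not part of grep)
matches() {
    grep -E -e "$1" "$dir/words.txt" | paste -s -d ' ' -
}

# Try several of the patterns from the list, one per output line.
for pattern in '[a3][c5]' '[^abc]' '^c.b' 'c+b' 'c.+b' 'c.?b' 'c.*b$' 'c{3}' '(bc){2}' 'dog|35'
do
    printf '%-10s => %s\n' "$pattern" "$(matches "$pattern")"
done
```

With this word list, 'c+b' reports cb, cccb, and bcbc, while 'c.+b' reports cab, caab, and cccb: the first needs the c's to sit directly before the b, and the second needs at least one character in between. Changing the word list and rerunning the loop is a quick way to test your understanding of a pattern.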


awk

Try it out for yourself or read about it in a UNIX reference guide.
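
As a starting point for experimenting, here is a minimal sketch of two one-line awk programs. The file demo.c and its contents are made up for illustration; NR (the current line number), $0 (the whole line), and $1 (the first whitespace-separated field) are standard awk variables.

```shell
# Scratch file invented for this example.
dir=$(mktemp -d)
printf 'int count = 0;\nint total = 0;\ncount = count + 1;\n' > "$dir/demo.c"

# Print the line number and text of every line matching /count/,
# roughly what grep -n 'count' would report.
awk_matches=$(awk '/count/ { print NR ": " $0 }' "$dir/demo.c")
echo "$awk_matches"

# Print only the first field of each line.
first_fields=$(awk '{ print $1 }' "$dir/demo.c")
echo "$first_fields"
```

An awk program is a list of pattern { action } pairs: the action runs on each input line that matches the pattern, and an action with no pattern runs on every line.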