Wed Mar 28 09:58:56 EDT 2007

Here's some advice for assignment 5, since early returns from scattered
precincts suggest that there are pitfalls for the unwary.  No particular
order.  I will add things as they occur to me, so check back often.

-- You have to run get_all to make a reg.txt; the latter is not part
of the distribution.

-- Make sure that permissions in your campuscgi directory permit an
ordinary user to at least run your code and access your files; the
server is running as something like "none", not as you.
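
A minimal sketch of one way to open things up.  The modes 755 and 644
are an assumption about what "run your code and access your files"
requires, and the temporary directory is only a stand-in for your
campuscgi directory:

```shell
dir=$(mktemp -d)                   # stand-in for your campuscgi directory
touch "$dir/reg.cgi" "$dir/reg.txt"

chmod 755 "$dir" "$dir/reg.cgi"    # anyone may enter the directory and run the script
chmod 644 "$dir/reg.txt"           # anyone may read the data file

ls -ld "$dir" "$dir"/*             # check the result by eye
```

The ls -ld at the end is the useful habit: look at the permission
columns rather than assuming they are what you meant.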

-- The campuscgi machine runs SunOS, not Linux; that's what is running
when you run the script(s) via reg.html.  This means that what you managed to
get running when logged in to hats or arizona is not necessarily what's
running via the browser.  I got bitten by this myself by using a
search path in reg.cgi that was fine for Linux but not for Solaris.
(Watch out for this -- I've fixed the version on the web page.)
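
One way to see which environment a script really runs in is to have it
report the facts itself.  This is only a sketch, assuming a POSIX shell
(a very old /bin/sh may lack $(...) and command -v); the variable names
are made up for illustration:

```shell
os=$(uname -s)                   # "SunOS" on Solaris, "Linux" on Linux
echo "running on: $os"
echo "PATH is: $PATH"

awkpath=$(command -v awk)        # the awk that this PATH actually finds
echo "awk found at: ${awkpath:-nowhere}"
```

Run it once from a login shell and once via the browser and compare the
two outputs.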

-- You do not have to use awk!  It's easy for some things but it is not
the most expressive language in the world and it has some surprising
behaviors.  I wrote my partial solution in Perl, based on example code
presented in class.  You could even write it in Java, which appears to
work fine and will be even more familiar.

-- Print statements are your friend!  Those residual echo statements in
reg.cgi and the commented-out prints in various scripts are examples.
When you're working in an unfamiliar language with unfamiliar tools,
verifying each step by printing input and output is much more effective
than beating your head against the wall.  Work your way through one line
at a time if necessary: print what came in and what went out, to see if
they are correct.  (That's how I ultimately figured out my search path
problems, though it took longer than it should have because I forgot
this cardinal principle.)
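
A tiny illustration of the idea, not taken from reg.cgi; the input value
and the tr step are hypothetical stand-ins for whatever step you are
checking:

```shell
q='cos 333'                          # hypothetical input
echo "DEBUG in:  [$q]"               # what came in

up=$(echo "$q" | tr 'a-z' 'A-Z')     # the step being checked
echo "DEBUG out: [$up]"              # what went out
```

The square brackets make stray leading or trailing blanks visible, which
is often where the trouble is.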

-- This is not meant to be a time-consuming exercise, nor is a lot of
code necessary if you think clearly and cut the right kinds of corners.
My Perl script is under 20 lines long aside from some static data
structures, and it's very mundane.  In hindsight I could have done it
nearly as easily in awk, though the latter doesn't have an explicit
case-insensitive RE match.  An example of corner-cutting: the static
data structures were created by a text editor; there's no need to write
code to create them since they don't change.  I made a few simple
changes to get_all.awk but it's about the same size.  (And I didn't do a
fifth feature; that might add another few lines.)
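
On the case-insensitive match: a common workaround in awk is to fold the
input to lower case with tolower() and match against a lower-case
pattern.  A sketch with made-up input lines (an older awk may lack
tolower; nawk is worth trying if so):

```shell
out=$(printf 'Advanced PROGRAMMING Techniques\nShakespeare\nprogramming Languages\n' |
  awk 'tolower($0) ~ /programming/')   # pattern must itself be lower case
echo "$out"
```

This prints the first and third lines and skips the second.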

-- Here's a useful awk feature mentioned in class, using FILENAME to
select what actions to perform on different input files:

 awk '
   FILENAME == "distcode.txt" { action done only on lines in distcode.txt }
   FILENAME == "reg.txt" { action done only on lines in reg.txt }
 ' distcode.txt reg.txt

This lets you use the implicit input loop rather than explicit getlines.
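
A runnable sketch of the same pattern.  The file contents here are
invented for the example and are not the real formats of distcode.txt
or reg.txt:

```shell
tmp=$(mktemp -d)
printf 'QR Quantitative Reasoning\nLA Literature and the Arts\n' > "$tmp/distcode.txt"
printf 'COS 333 QR\nENG 200 LA\n' > "$tmp/reg.txt"

result=$(cd "$tmp" && awk '
  FILENAME == "distcode.txt" {     # first file: build a code -> name table
    code = $1
    sub(/^[^ ]+ /, "")             # strip the code, leaving the name in $0
    name[code] = $0
  }
  FILENAME == "reg.txt" {          # second file: expand the code on each line
    print $1, $2 ": " name[$3]
  }
' distcode.txt reg.txt)
echo "$result"
```

Note that FILENAME holds the name exactly as given on the command line,
so the comparison strings have to match that spelling, directory and all.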

-- Another useful feature, which might help you pass a query string in
to an awk program:

  awk -v qs="$q1" -f whatever.awk reg.txt

The -v argument (of which there may be more than one) sets an awk variable
to a value before the awk program begins execution.
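
A small sketch of -v in use, with a made-up query string and input.  It
is safer to put the shell variable in double quotes: if the value
contains a blank and is left unquoted, the shell splits it into two
arguments and awk sees only the first word as the value.

```shell
q1='COS 333'      # hypothetical query string; note the embedded blank
out=$(printf 'COS 333 Advanced Programming\nCOS 126 Intro\n' |
  awk -v qs="$q1" 'index($0, qs) > 0')   # print lines containing qs
echo "$out"
```

Here index() does a plain substring test, so characters in the query
that are special in regular expressions cause no surprises.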