Assignment 2 - Generating WWW Listings

Generating WWW Listings

Write c2html, a program that generates a hypertext listing of its file arguments, suitable for browsing with a World Wide Web client such as Netscape. For example, the command

% c2html -tTitle infile1.c infile2.c infile3.c > outfile.html

will create a file called outfile.html, an HTML document with the title "Title" that contains listings of the input files infile1.c, infile2.c, and infile3.c. An executable version of c2html is available in /u/cs217/Assignment2/c2html. A sample output is available in /u/cs217/Assignment2/outSample.html. You can browse the copy of the file here.

WWW documents are written in `HTML' - the HyperText Markup Language. You do not need to know much about HTML; simply mimic the output of the program /u/cs217/Assignment2/c2html. The source code for the program is 106 lines long.

An HTML document contains text and embedded formatting commands. Most formatting commands have the form <X>text</X> where X is the formatting command that applies to text. Commands that do not apply to specific pieces of text have only the leading <X>.

A command like the one shown above generates an HTML file with the following general structure.

<header>
<title>title</title>
</header>
<body>
<h1>title</h1>
 
<h2><a name="contents">Contents</a></h2>
 
<ul>
<li><a href="#1">file1</a>
...
<li><a href="#N">fileN</a>
</ul>

<hr>
 
<h2><a name="1">file1</a></h2>
<pre>...listing for file1...
</pre>
<a href="#contents">Goto the Contents</a>
<p><hr>
...
<h2><a name="N">fileN</a></h2>
<pre>...listing for fileN...
</pre>
<a href="#contents">Goto the Contents</a>
 
<p><hr>
<address>
date
</address>
</body>
title is the title string, which is specified by the -t option, e.g., -tSampleTitle. If this option is omitted, the lines <title>title</title> and <h1>title</h1> do not appear in the output. file1 through fileN are the file names in the order they appear on the command line. date is the date and time at which the command is executed, in the form printed by the date command, e.g.,

Fri Feb 9 11:30:59 2001
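The date line can be produced with the standard C library. The following is only a sketch: the helper names current_date and put_footer are hypothetical, and it assumes that the layout produced by ctime is acceptable. Note that ctime pads a single-digit day with a space, so compare the result against the output of the sample c2html before relying on it.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: return the current date and time as text.
   ctime produces a 24-character stamp followed by '\n',
   for example "Fri Feb  9 11:30:59 2001\n". */
const char *current_date(void)
{
    time_t now = time(NULL);
    return ctime(&now);
}

/* Hypothetical helper: write the closing lines of the document. */
void put_footer(FILE *out)
{
    /* The stamp already ends in a newline, so none is added after %s. */
    fprintf(out, "<p><hr>\n<address>\n%s</address>\n</body>\n",
            current_date());
}
```

Calling put_footer(stdout) after the last listing would emit the final lines of the structure shown above.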

Each file is listed verbatim, except for certain include directives. If an include directive names one of the ANSI Standard include files in angle brackets, a hypertext link to the include file itself is placed in the output. For example, if one of the input files contains the line

 #include <stdlib.h>

 c2html prints this line as

 #include &lt;<a href="http://www.princeton.edu/~cs217/include/stdlib.h">stdlib.h</a>&gt;

Note the use of `&lt;' and `&gt;' for the literal occurrences of < and >; c2html must replace every < and > in the listing with &lt; and &gt; to avoid confusing WWW clients that interpret HTML formatting codes. It must also replace every occurrence of `&' with &amp;.
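The replacement of the three special characters can be done one character at a time. This is a minimal sketch, not the reference solution; the names html_entity and put_escaped are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: return the replacement text for a character
   that is special in HTML, or NULL if c can be copied unchanged. */
const char *html_entity(int c)
{
    switch (c) {
    case '<': return "&lt;";
    case '>': return "&gt;";
    case '&': return "&amp;";
    default:  return NULL;
    }
}

/* Write one character of a listing to out, escaped if necessary. */
void put_escaped(int c, FILE *out)
{
    const char *entity = html_entity(c);

    if (entity != NULL)
        fputs(entity, out);
    else
        putc(c, out);
}
```

Because each character is handled independently, this approach does not depend on the length of the input line.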

Similarly, if an include directive refers to one of the file arguments, a hypertext link to that file's listing within the same document is emitted. For example, if one of the input files contains a line such as

 #include "somefile.h"

 c2html prints this line as

 #include "<a href="#i">somefile.h</a>"

 The "#i" is the same string that appears just after the <h2> code at the beginning of the listing for somefile.h.

If an include directive refers to a file that is neither one of the ANSI Standard include files nor one of the files listed in the call to c2html, it is printed verbatim, that is, without a hypertext link.

Make sure that your program detects "#include" and the file name that follows it properly, even if there are extra spaces in the line. That is, the following line should not be listed verbatim, but should be treated as an include directive.

__________#include_________"somefile.h"

where each run of underscores stands for several spaces.
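One way to recognize such lines is sketched below. The function parse_include is hypothetical, and the sketch assumes that the line is already in memory as a null-terminated string and that only spaces and tabs count as extra spacing.

```c
#include <string.h>

/* Hypothetical helper: if line is an include directive, copy the file
   name into name (at most namesize - 1 characters) and return the
   opening delimiter, '<' or '"'.  Otherwise return 0. */
int parse_include(const char *line, char *name, size_t namesize)
{
    static const char keyword[] = "#include";
    const size_t klen = sizeof keyword - 1;
    char open, close;
    size_t n = 0;

    if (namesize == 0)
        return 0;
    while (*line == ' ' || *line == '\t')   /* spaces before #include */
        line++;
    if (strncmp(line, keyword, klen) != 0)
        return 0;
    line += klen;
    while (*line == ' ' || *line == '\t')   /* spaces before the name */
        line++;

    open = *line;
    if (open == '<')
        close = '>';
    else if (open == '"')
        close = '"';
    else
        return 0;
    line++;

    while (*line != close) {
        /* Give up on an unterminated or overlong file name. */
        if (*line == '\0' || *line == '\n' || n + 1 >= namesize)
            return 0;
        name[n++] = *line++;
    }
    name[n] = '\0';
    return open;
}
```

The returned delimiter tells the caller which kind of link to look for: '<' for an ANSI Standard include file, '"' for one of the file arguments.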

As a diagnostic, when a file listed in the call to c2html does not exist or cannot be opened, the following message should be written to standard error:

Error when openning ******

Your program must not crash in this situation. Instead, it should also write the same message at the place where the listing for that file would have appeared.
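The required behavior can be sketched as follows. The function list_file is hypothetical, and the sketch assumes that the asterisks in the message stand for the name of the file; check the output of the sample c2html to confirm the exact text. The spelling of the message is copied from above.

```c
#include <stdio.h>

/* Hypothetical helper: copy the file called path to out.
   Returns 0 on success and -1 if the file cannot be opened. */
int list_file(const char *path, FILE *out)
{
    FILE *in = fopen(path, "r");
    int c;

    if (in == NULL) {
        /* Assumption: the asterisks in the handout are the file name. */
        fprintf(stderr, "Error when openning %s\n", path);
        fprintf(out, "Error when openning %s\n", path);
        return -1;
    }
    /* A real listing would escape special characters and handle
       include directives here; this sketch copies the text as is. */
    while ((c = getc(in)) != EOF)
        putc(c, out);
    fclose(in);
    return 0;
}
```

Since the failure is reported through the return value, the caller can continue with the next file argument instead of stopping.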

Lines of input files may be longer than 80 characters. To keep things simple, you may make the following assumption about #include lines: such a line may contain more than 80 spaces, but the total number of characters on it, excluding spaces, will not exceed 80.
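One possible way to cope with long lines is to read each line into a buffer that grows as needed. This sketch sidesteps the 80-character assumption rather than exploiting it, and it is only one of several workable approaches; the function read_line is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from in, keeping
   the trailing newline if there is one.  Returns a string that the
   caller must free, or NULL at end of file or if memory runs out. */
char *read_line(FILE *in)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF) {
        if (len + 2 > cap) {            /* room for c and the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    if (len == 0) {                     /* end of file, nothing read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A program that processes its input one character at a time, without storing whole lines, would not need such a buffer at all.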

Submission

Submit your program, a makefile, and a readme. You can look at the Makefile provided, or write your own. The readme file is a brief description of your program; it should include the program's input, output, author, and modification history. Submit your program electronically with the command
/u/cs217/bin/submit 2 readme makefile ...
You may use other computing facilities to develop your program, but the submitted version must compile with lcc -A and execute correctly on Arizona. To encourage good coding practices, you will lose points for any warning messages generated by lcc during the compilation of your program.
 
Due: submitted by 11:59pm, Monday, February 19, 2001.