Princeton University
COS 217: Introduction to Programming Systems

Assignment 7: A Unix Shell


Purpose

The purpose of this assignment is to help you learn about Unix processes, low-level input/output, and signals. It also will give you ample opportunity to define software modules; in that sense the assignment is a capstone for the course.


Rules

This assignment is an individual assignment, not a team assignment.

Signal handling (as described below) is the "extra challenge" part of this assignment. While doing the "extra challenge" part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the lab teaching assistants, other current students via Piazza, or any other people while working on the "extra challenge" part of this assignment, except for clarification of requirements.

The "extra challenge" part is worth 6 percent of this assignment. So if you don't do any of the "extra challenge" part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 94 percent.


Background

A Unix shell is a program that makes the facilities of the operating system available to interactive users. There are several popular Unix shells: sh (the Bourne shell), csh (the C shell), and bash (the Bourne Again shell) are a few.


Your Task

Your task in this assignment is to create a series of three related programs. The programs must be named ishlex, ishsyn, and ish. Your ish program must be a minimal but realistic interactive Unix shell. Your development of the simpler ishlex and ishsyn programs will help you to develop your ish program. A Supplementary Information page lists detailed requirements and recommendations.


Stage 1: Lexical Analysis

Compose a lexical analyzer for your programs. Your lexical analyzer must be defined in a distinct module. Your lexical analyzer must accept an array of characters, and return a DynArray object containing tokens. (The DynArray ADT was described in precepts. The source code defining the DynArray ADT is available in the FC010 /u/cos217/Assignment7 directory.) Compose additional modules that are used by your lexical analyzer, as appropriate.

From the user's point of view, a token is a word. (Your program may represent a token using a richer data structure.) More formally, from the user's point of view a token consists of a sequence of non-white-space characters that is separated from other tokens by white-space characters. There are two exceptions:

Special characters inside of strings are not separate tokens. It is an error for an "opening" double quote within a line to be unmatched by a "closing" double quote.

Make no assumptions about the length of each line. Your lexical analyzer must work for lines of any length.
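One way to avoid assuming a maximum line length is to read each line into a buffer that grows on demand. The following is a minimal sketch of that idea only, not a required design; the function name readLine and the initial capacity of 16 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from stream.
   Return a heap-allocated string without the trailing newline (the
   caller must free it), or NULL if end-of-file is reached before any
   character is read or if memory is exhausted. */
char *readLine(FILE *stream)
{
   size_t capacity = 16;   /* arbitrary starting size */
   size_t length = 0;
   char *line;
   int c;

   assert(stream != NULL);

   line = malloc(capacity);
   if (line == NULL)
      return NULL;

   while ((c = fgetc(stream)) != EOF && c != '\n')
   {
      /* Keep one byte in reserve for the terminating '\0'. */
      if (length + 1 == capacity)
      {
         char *bigger = realloc(line, capacity * 2);
         if (bigger == NULL)
         {
            free(line);
            return NULL;
         }
         line = bigger;
         capacity *= 2;
      }
      line[length++] = (char)c;
   }

   if (c == EOF && length == 0)
   {
      free(line);
      return NULL;
   }

   line[length] = '\0';
   return line;
}
```

A client could call readLine(stdin) in a loop until it returns NULL, passing each returned array of characters to the lexical analyzer and freeing it afterward.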

Then compose a client of your lexical analyzer. The client must be defined in a file named ishlex.c. Use the ishlex.c client, your lexical analyzer module, and other modules that you have composed to build a program named ishlex. Your ishlex must:

It must do that repeatedly until the program reaches end-of-file of stdin. Recall that typing Ctrl-d simulates end-of-file when stdin is bound to the terminal.

Test your ishlex thoroughly. These files, provided on FC010 in the /u/cos217/Assignment7 directory, will help you with your testing:


Stage 2: Syntactic Analysis

Compose a syntactic analyzer for your programs. Your syntactic analyzer must be defined in a distinct module. Your syntactic analyzer must accept a DynArray object containing tokens, and return a command. Compose additional modules as appropriate.

A command must begin with an ordinary token, which is the command's name. It is an error for a command not to begin with an ordinary token. The command name might be followed by command-line arguments, a redirection of stdin, and/or a redirection of stdout.

Your syntactic analyzer must handle redirection in these ways:

Then compose a client of your syntactic and lexical analyzer modules. The client must be defined in a file named ishsyn.c. Use the ishsyn.c client, your syntactic and lexical analyzer modules, and other modules that you have composed to build a program named ishsyn.

The behavior of your ishsyn must be a superset of the behavior of your ishlex, except that your ishsyn must not write tokens to stdout. More precisely, your ishsyn must:

It must do that repeatedly until the program reaches end-of-file of stdin.

Test your ishsyn thoroughly. These files, provided on FC010 in the /u/cos217/Assignment7 directory, will help you with your testing:


Stage 3: Handling Executable Binary Commands

Compose a "first draft" of ish. At this stage ish must handle simple executable binary commands, that is, executable binary commands that contain no redirection (via < or >).

Specifically, compose a file named ish.c. Use ish.c, your lexical and syntactic analyzer modules, and other modules that you have composed to build a program named ish. Compose additional modules as appropriate.

The behavior of your ish must be a superset of the behavior of your ishsyn, except that your ish must not write commands to stdout. More precisely, your ish must:

It must do that repeatedly until the program reaches end-of-file of stdin.
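As a general illustration of how a shell can run an executable binary command, the following sketch uses the standard fork, execvp, and waitpid calls. It is a sketch under stated assumptions, not the required design: the function name runCommand and its return convention are hypothetical, and the text that perror writes is not guaranteed to match the error messages of sampleish.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the command named by argv[0] with the
   NULL-terminated argument array argv, and wait for it to finish.
   programName is used as the prefix of any error message. Return the
   child's exit status, or -1 if the child could not be created or
   did not exit normally. */
int runCommand(char *argv[], const char *programName)
{
   pid_t pid;
   int status;

   assert(argv != NULL);
   assert(argv[0] != NULL);
   assert(programName != NULL);

   /* Flush buffered output so the child does not write it again. */
   fflush(NULL);

   pid = fork();
   if (pid == -1)
   {
      perror(programName);
      return -1;
   }

   if (pid == 0)
   {
      /* Child: replace this process image with the command. */
      execvp(argv[0], argv);
      /* execvp returns only if it failed. */
      perror(programName);
      exit(EXIT_FAILURE);
   }

   /* Parent: wait for this particular child to terminate. */
   if (waitpid(pid, &status, 0) == -1)
   {
      perror(programName);
      return -1;
   }

   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because execvp searches the directories named in the PATH environment variable, the user can type a command name such as ls rather than a full path.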

Test your ish thoroughly. These files, provided on FC010 in the /u/cos217/Assignment7 directory, will help you with your testing:


Stage 4: Handling Shell Built-In Commands

Enhance ish so it handles shell built-in commands. Specifically, ish must interpret four shell built-in commands:

setenv var [value]: If environment variable var does not exist, then your ish must create it. Your ish must set the value of var to value, or to the empty string if value is omitted. Note: Initially, your ish inherits environment variables from its parent. Your ish must be able to modify the value of an existing environment variable or create a new environment variable via the setenv command. Your ish must be able to set the value of any environment variable; but the only environment variable that it explicitly uses is HOME. It is an error for a setenv command to have zero or more than two command-line arguments.

unsetenv var: Your ish must destroy the environment variable var. It is an error for an unsetenv command to have zero command-line arguments or more than one command-line argument.

cd [dir]: Your ish must change its working directory to dir, or to the HOME directory if dir is omitted. It is an error for a cd command to have more than one command-line argument. It is an error for a cd command to have zero command-line arguments if the HOME environment variable is not set.

exit: Your ish must exit with status 0. It is an error for an exit command to have any command-line arguments.
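The argument-count rules for the four built-in commands can be sketched with the standard setenv, unsetenv, getenv, and chdir functions. This is a minimal sketch, not the required design: the function name runBuiltin and the three result codes are hypothetical, and the sketch deliberately writes no error messages because their exact text is defined by sampleish.

```c
#define _POSIX_C_SOURCE 200112L   /* exposes setenv and unsetenv */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes. */
enum { NOT_BUILTIN = 0, BUILTIN_OK = 1, BUILTIN_ERROR = -1 };

/* Hypothetical helper: argv[0] is the command name and argc counts
   it, so argc - 1 is the number of command-line arguments. */
int runBuiltin(int argc, char *argv[])
{
   assert(argv != NULL);
   assert(argc >= 1);

   if (strcmp(argv[0], "setenv") == 0)
   {
      /* One or two arguments; an omitted value means "". */
      if (argc < 2 || argc > 3)
         return BUILTIN_ERROR;
      if (setenv(argv[1], (argc == 3) ? argv[2] : "", 1) != 0)
         return BUILTIN_ERROR;
      return BUILTIN_OK;
   }

   if (strcmp(argv[0], "unsetenv") == 0)
   {
      /* Exactly one argument. */
      if (argc != 2 || unsetenv(argv[1]) != 0)
         return BUILTIN_ERROR;
      return BUILTIN_OK;
   }

   if (strcmp(argv[0], "cd") == 0)
   {
      const char *dir;
      /* At most one argument; fall back to HOME if none. */
      if (argc > 2)
         return BUILTIN_ERROR;
      dir = (argc == 2) ? argv[1] : getenv("HOME");
      if (dir == NULL || chdir(dir) != 0)
         return BUILTIN_ERROR;
      return BUILTIN_OK;
   }

   if (strcmp(argv[0], "exit") == 0)
   {
      /* No arguments allowed. */
      if (argc != 1)
         return BUILTIN_ERROR;
      exit(0);
   }

   return NOT_BUILTIN;
}
```

These commands must run in the shell process itself rather than in a child: a change made to the environment or working directory of a child process would be lost when that child exits.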

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to its handling of shell built-in commands. You will find the aforementioned testish and testishdiff scripts helpful.


Stage 5: Handling Redirection

Enhance your ish so it handles redirection of stdin and/or stdout.

It is erroneous for stdin to be redirected to a file that does not exist.

If stdout is redirected to a file that does not exist, then your ish must create it. If stdout is redirected to a file that already exists, then your ish must destroy the file's contents and rewrite the file from scratch. Your ish must set the permissions of the file to 0600.

It is erroneous for stdout to be redirected to a file whose name is invalid. For example, it is erroneous for stdout to be redirected to a file named "/" or ".", or for stdout to be redirected to a file in some directory whose contents the user cannot change.
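The redirection rules above map naturally onto the open and dup2 system calls. The following is a minimal sketch under stated assumptions, not the required design: the function names redirectStdin and redirectStdout are hypothetical, the functions are assumed to be called in the child process after fork and before exec, and error reporting is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: make file descriptor fd the new target of
   targetFd, then close fd. Return 0 on success, -1 on failure. */
static int replaceDescriptor(int fd, int targetFd)
{
   if (fd == -1)
      return -1;
   if (dup2(fd, targetFd) == -1)
   {
      close(fd);
      return -1;
   }
   close(fd);
   return 0;
}

/* Redirect stdin to the existing file fileName. Fails if the file
   does not exist. Return 0 on success, -1 on failure. */
int redirectStdin(const char *fileName)
{
   assert(fileName != NULL);
   return replaceDescriptor(open(fileName, O_RDONLY), STDIN_FILENO);
}

/* Redirect stdout to fileName. O_CREAT creates the file if it does
   not exist (the mode 0600 applies at creation time), and O_TRUNC
   discards the contents of an existing file. Fails for names such as
   "/" or ".", which are directories. Return 0 on success, -1 on
   failure. */
int redirectStdout(const char *fileName)
{
   assert(fileName != NULL);
   return replaceDescriptor(
      open(fileName, O_WRONLY | O_CREAT | O_TRUNC, 0600),
      STDOUT_FILENO);
}
```

Performing the redirection in the child leaves the shell's own stdin and stdout untouched, so the next prompt still reads from and writes to the terminal.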

Note that the four shell built-in commands neither read from stdin nor write to stdout. So it would be pointless (but not erroneous) for the user to redirect stdin or stdout within any of those commands. More precisely, when given a shell built-in command containing redirection of stdin or stdout, your ish must lexically and syntactically analyze the entire command, including the part that redirects stdin or stdout, just as your ishlex and your ishsyn do, and must report any lexical or syntactic errors that it encounters. However, your ish must not implement the specified file redirection.

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to handling of redirection. You will find the aforementioned testish and testishdiff scripts helpful.


Stage 6: Handling Signals

Enhance your ish to handle SIGINT signals.

When the user types Ctrl-c, Linux sends a SIGINT signal to the parent process and its children. Upon receiving a SIGINT signal:

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to handling of signals.


Error Handling

Your programs must handle each erroneous line gracefully by writing a descriptive error message to stderr and rejecting the line. Any error message written by your programs must begin with "programName: " where programName is argv[0], that is, the name of your program's executable binary file. Note that argv[0] typically will be ishlex, ishsyn, or ish, but need not be so.

The error messages written by your programs must be identical to those written by sampleishlex, sampleishsyn, and sampleish. However, if your programs read a line that contains multiple errors, then your programs can report any one of the errors, not necessarily the same error that sampleishlex, sampleishsyn, and sampleish report.

It must be impossible for the user's input to cause your programs to assert or generate a segmentation fault.


Memory Management

Your programs must contain no memory leaks. For every call of malloc or calloc, eventually there must be a corresponding call of free. More specifically, your programs must produce clean meminfo reports when the user terminates your programs by typing Ctrl-d. ish need not produce a clean meminfo report when the user terminates the program by issuing the exit command.


Logistics

Develop on FC010. Use emacs to create source code. Use make to automate the build process. Use gdb to debug.

Critique your programs using the splint tool. Each time splint generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Similarly, critique your programs using the critTer tool. Each time critTer generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Create a Makefile. The first dependency rule must build all three programs. That is, the first dependency rule in the Makefile must be

all: ishlex ishsyn ish

The Makefile must maintain object (.o) files to allow for partial builds, and encode the dependencies among the files that comprise your programs. As always, use the gcc217 command to build.

Create a readme file by copying the file /u/cos217/Assignment7/readme to your project directory, and editing the copy by replacing each area marked "<Complete this section.>" as appropriate.

One of the sections of the readme file requires you to list the authorized sources of information that you used to complete the assignment. Another section requires you to list the unauthorized sources of information that you used to complete the assignment. Your grader will not grade your submission unless you have completed those sections. To complete the "authorized sources" section of your readme file, copy the list of authorized sources given in the "Policies" web page to that section, and edit it as appropriate.

Submit your work electronically on FC010 using these commands:

submit 7 readme Makefile ishlex.c ishsyn.c ish.c
submit 7 dynarray.h dynarray.c
submit 7 allOtherModuleFiles

Don't forget to submit both your .h files and your .c files.

Suggestion: To make sure that your submission is complete, use this approach. Create a temporary directory. Copy the files that comprise your submission to that directory. Build your programs in that directory to make sure that no files are missing. Delete from that directory all files that you do not wish to submit, for example, executable binary files and .o files. Finally, submit all of the files in that directory by issuing the command submit 7 *.


Grading

To receive any credit for your ishlex, your ishsyn, or your ish, that program must build.

We will grade your programs on quality from the user's and programmer's points of view. From the user's point of view, your programs have quality if they behave as they must. The correct behavior of your programs is defined by the previous sections of this assignment specification and by the given sampleishlex, sampleishsyn, and sampleish programs.

From the programmer's point of view, your code has quality if it is well styled and thereby easy to maintain. In part, good style is defined by the splint and critTer tools, and the rules given in The Practice of Programming (Kernighan and Pike) as summarized by the Rules of Programming Style document. The more course-specific style rules listed in the previous assignment specifications also apply. Proper function-level and file-level modularity will be a prominent part of your grade. To encourage good coding practices, we will deduct points if gcc217 generates warning messages.

We will grade your lexical analyzer by running your ishlex, running sampleishlex, and comparing the output using the diff command. We recommend that you test your lexical analyzer using that approach. The given testishlex and testishlexdiff scripts will help you to do that. Your ishlex must generate exactly the same output as sampleishlex does, except for the program names that appear at the beginnings of error messages.

Your ishsyn must use the same lexical analyzer as your ishlex does. We will grade your syntactic analyzer by running your ishsyn, running sampleishsyn, and comparing the output using diff. We recommend that you test your syntactic analyzer using that approach. The given testishsyn and testishsyndiff scripts will help you to do that. Your ishsyn must generate exactly the same output as sampleishsyn does, except for the program names that appear at the beginnings of error messages.

Your ish must use the same lexical and syntactic analyzers as your ishsyn does. We will grade your ish by running it, running sampleish, and comparing the output. We recommend that you test your ish using that approach. The given testish and testishdiff scripts will help you to do that. Your ish must generate exactly the same output as sampleish does, except for the program names that appear at the beginnings of error messages.

Remember that the Supplementary Information page lists detailed implementation requirements and recommendations.


This assignment was written by Robert M. Dondero, Jr.
with contributions by many other faculty members and students.