Princeton University
COS 217: Introduction to Programming Systems

Assignment 7: A Unix Shell


Purpose

The purpose of this assignment is to help you learn about Unix processes, low-level input/output, and signals. It also will give you ample opportunity to define software modules; in that sense the assignment is a capstone for the course.


Rules

This assignment is an individual assignment, not a team assignment.

Signal handling (as described below) is the "extra challenge" part of this assignment. While doing the "extra challenge" part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the lab teaching assistants, other current students via Piazza, or any other people while working on the "extra challenge" part of this assignment, except for clarification of requirements.

The "extra challenge" part is worth 8 percent of this assignment. So if you don't do any of the "extra challenge" part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 92 percent.


Background

A Unix shell is a program that makes the facilities of the operating system available to interactive users. There are several popular Unix shells: sh (the Bourne shell), csh (the C shell), and bash (the Bourne Again shell) are a few.


Your Task

Your task in this assignment is to create a program named ish. Your program must be a minimal but realistic interactive Unix shell. A Supplementary Information page lists detailed requirements and recommendations.


Functionality

When first started, your program must check to see if the user provided a command-line argument. If so, then your program must treat the command-line argument as a file name, and repeatedly must:

until the program reaches end-of-file. Thus that mechanism provides a convenient way for the user to automate the execution of configuration commands, and for you to automate the testing of your program. Thereafter your program repeatedly must:

Your program must exit when the user types Ctrl-d or issues the exit command. (See also the section below entitled "Signal Handling.")


Lexical Analysis

From the user's point of view, a token must be a word. (Your program may represent a token using a richer data structure.) More formally, from the user's point of view a token must consist of a sequence of non-white-space characters that is separated from other tokens by white-space characters. There must be two exceptions:

Your program must make no assumptions about the length of each line that it reads. Your program must work for lines of any length.
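
One way to satisfy the "lines of any length" requirement is to grow a heap buffer as characters arrive. The following is a minimal sketch, not the required design; the function name readLine and the initial capacity of 16 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from stream into a heap buffer.
   Return the line without its trailing newline, or NULL at
   end-of-file (when nothing was read) or if memory is exhausted.
   The caller owns the returned buffer and must free it.
   readLine is a hypothetical name, not part of the assignment. */
char *readLine(FILE *stream)
{
   size_t capacity = 16;   /* arbitrary starting size */
   size_t length = 0;
   char *line;
   int c;

   assert(stream != NULL);

   line = malloc(capacity);
   if (line == NULL)
      return NULL;

   while ((c = fgetc(stream)) != EOF && c != '\n')
   {
      /* Keep one slot free for the terminating '\0'. */
      if (length + 1 == capacity)
      {
         char *bigger = realloc(line, capacity * 2);
         if (bigger == NULL)
         {
            free(line);
            return NULL;
         }
         line = bigger;
         capacity *= 2;
      }
      line[length++] = (char)c;
   }

   if (c == EOF && length == 0)
   {
      free(line);
      return NULL;
   }

   line[length] = '\0';
   return line;
}
```

Doubling the capacity keeps the total copying cost linear in the line length. Because the caller frees each returned line, the sketch is compatible with the memory management requirement described later.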


Syntactic Analysis

From the user's point of view, a command must be a sequence of tokens, the first of which specifies the command name. (Your program may represent a command using a richer data structure.)

The '<' token must indicate that the following token is the name of a file. Your program must redirect the command's stdin to that file. It must be an error to redirect a command's stdin more than once.

The '>' token must indicate that the following token is the name of a file. Your program must redirect the command's stdout to that file. It must be an error to redirect a command's stdout more than once.
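
The redirection rules above can be checked in a single pass over the tokens. The sketch below assumes that tokens are plain C strings and that every "<" or ">" string is a redirection token; your own token representation may be richer and may need to distinguish special tokens from ordinary ones. The name findRedirects is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scan tokens[0..count-1] for stdin and stdout redirection.
   On success return 1 and store each file name (or NULL if that
   stream is not redirected) in *inFile and *outFile. Return 0 if
   a stream is redirected more than once or if a redirection token
   is not followed by a file name.
   findRedirects is a hypothetical name used for illustration. */
int findRedirects(char *tokens[], size_t count,
                  char **inFile, char **outFile)
{
   size_t i;

   assert(tokens != NULL);
   assert(inFile != NULL);
   assert(outFile != NULL);

   *inFile = NULL;
   *outFile = NULL;

   for (i = 0; i < count; i++)
   {
      char **target;

      if (strcmp(tokens[i], "<") == 0)
         target = inFile;
      else if (strcmp(tokens[i], ">") == 0)
         target = outFile;
      else
         continue;   /* an ordinary token: command name or argument */

      if (*target != NULL)
         return 0;   /* the same stream is redirected twice */
      if (i + 1 == count)
         return 0;   /* no file name follows the redirection token */

      *target = tokens[++i];   /* consume the file name */
   }
   return 1;
}
```

For example, the tokens sort < in.txt > out.txt yield "in.txt" and "out.txt", while sort < a < b is rejected because stdin is redirected twice.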


Execution

Your program must interpret four shell built-in commands:

setenv var [value]: If environment variable var does not exist, then your program must create it. Your program must set the value of var to value, or to the empty string if value is omitted. Note: Initially, your program inherits environment variables from its parent. Your program must be able to modify the value of an existing environment variable or create a new environment variable via the setenv command. Your program must be able to set the value of any environment variable; but the only environment variable that it explicitly uses is HOME.
unsetenv var: Your program must destroy the environment variable var.
cd [dir]: Your program must change its working directory to dir, or to the HOME directory if dir is omitted.
exit: Your program must exit with exit status 0. Note: exit takes no command-line arguments. It is an error for the user to execute the exit command with command-line arguments.

Note that those shell built-in commands neither read from stdin nor write to stdout. So it would be pointless (but not erroneous) for the user to redirect stdin or stdout within any of those commands.

More precisely, when given a shell built-in command containing redirection of stdin or stdout, your program must lexically and syntactically analyze the entire command -- including the part that redirects stdin or stdout -- and must report any errors that occur anywhere within the command. However your program must not implement the specified file redirection.
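
The built-in commands map closely onto the standard setenv, unsetenv, getenv, and chdir functions. The sketch below is one possible arrangement, assuming the command has already been analyzed and that argv holds only the command name and its arguments, with any redirection tokens removed. The name runBuiltin and the wording of the error messages are assumptions; exit is left to the caller.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* If argv[0] names the setenv, unsetenv, or cd built-in, run it
   and return 1 (whether or not it succeeded). Otherwise return 0.
   programName prefixes any error message written to stderr.
   runBuiltin is a hypothetical name used for illustration. */
int runBuiltin(const char *programName, char *argv[], int argc)
{
   assert(programName != NULL);
   assert(argv != NULL);
   assert(argc >= 1);

   if (strcmp(argv[0], "setenv") == 0)
   {
      if (argc < 2 || argc > 3)
         fprintf(stderr, "%s: setenv takes one or two arguments\n",
                 programName);
      /* The final 1 asks setenv to overwrite an existing value. */
      else if (setenv(argv[1], (argc == 3) ? argv[2] : "", 1) == -1)
         perror(programName);
      return 1;
   }

   if (strcmp(argv[0], "unsetenv") == 0)
   {
      if (argc != 2)
         fprintf(stderr, "%s: unsetenv takes one argument\n",
                 programName);
      else if (unsetenv(argv[1]) == -1)
         perror(programName);
      return 1;
   }

   if (strcmp(argv[0], "cd") == 0)
   {
      const char *dir = (argc >= 2) ? argv[1] : getenv("HOME");
      if (argc > 2)
         fprintf(stderr, "%s: cd takes at most one argument\n",
                 programName);
      else if (dir == NULL)
         fprintf(stderr, "%s: HOME is not set\n", programName);
      else if (chdir(dir) == -1)
         perror(programName);
      return 1;
   }

   return 0;   /* not one of the built-ins handled here */
}
```

The built-ins run inside the shell process itself rather than in a forked child, because a change to the environment or working directory made in a child would not affect the shell.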

If the command is not a shell built-in command, then your program must consider the command name to be the name of a file that contains code to be executed. Your program must fork a child process and pass the file name, along with its arguments, to the execvp system call. If the attempt to execute the file fails, then your program must write an error message indicating the reason for the failure.
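
The fork and execvp steps described above follow a standard pattern. The following sketch assumes that argv is a NULL-terminated array whose first element is the command name; the function name runCommand and its return convention are hypothetical. It omits redirection and signal handling.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that executes argv[0] with arguments argv, then
   wait for it in the foreground. Return the child's exit status,
   or -1 if the child could not be created or did not exit
   normally. programName prefixes any error message.
   runCommand is a hypothetical name used for illustration. */
int runCommand(const char *programName, char *argv[])
{
   pid_t pid;
   int status;

   assert(programName != NULL);
   assert(argv != NULL);
   assert(argv[0] != NULL);

   /* Flush buffered output so the child does not write it again. */
   fflush(NULL);

   pid = fork();
   if (pid == -1)
   {
      perror(programName);
      return -1;
   }

   if (pid == 0)
   {
      /* In the child. execvp returns only if it fails. */
      execvp(argv[0], argv);
      perror(programName);   /* reports the reason for the failure */
      exit(EXIT_FAILURE);
   }

   /* In the parent. Wait for the foreground child to finish. */
   if (waitpid(pid, &status, 0) == -1)
   {
      perror(programName);
      return -1;
   }
   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because perror prefixes its message with the given string and a colon, passing argv[0] of the shell as programName produces messages in the "programName: " form that the Error Handling section requires.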

If stdin is redirected to a file that does not exist, then your program must write an appropriate error message.

If stdout is redirected to a file whose name is somehow invalid (for example, if stdout is redirected to a file named "/" or ".", or if stdout is redirected to a file in some directory whose contents the user cannot change), then your program must write an appropriate error message. Otherwise, if stdout is redirected to a file that does not exist, then your program must create it. If stdout is redirected to a file that already exists, then your program must destroy the file's contents and rewrite the file from scratch. Your program must set the permissions of the file to 0600.
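
Redirection of stdout is commonly done in the child process, after fork and before execvp, by opening the file and duplicating its descriptor onto file descriptor 1. The sketch below shows one way to do so with open and dup2; the function name redirectStdout is hypothetical, and the caller is assumed to report the error (for example with perror) when it returns -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Make stdout refer to fileName. Create the file if it does not
   exist and discard its contents if it does. Return 0 on success,
   or -1 with errno set on failure.
   redirectStdout is a hypothetical name used for illustration. */
int redirectStdout(const char *fileName)
{
   int fd;

   assert(fileName != NULL);

   /* The mode 0600 applies only when open creates the file, and
      the process's umask may clear some of its bits. */
   fd = open(fileName, O_WRONLY | O_CREAT | O_TRUNC, 0600);
   if (fd == -1)
      return -1;

   if (dup2(fd, STDOUT_FILENO) == -1)
   {
      close(fd);
      return -1;
   }

   /* stdout now refers to the file, so fd is no longer needed. */
   close(fd);
   return 0;
}
```

Redirection of stdin is analogous: open the file with O_RDONLY and duplicate the descriptor onto STDIN_FILENO. An attempt to open "/" or "." for writing fails, which gives the invalid-name cases above a natural place to be detected.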


Process Handling

All child processes forked by your program must run in the foreground. Your program need not support background processes.


Signal Handling

When the user types Ctrl-c, Linux sends a SIGINT signal to the parent process and its children. Upon receiving a SIGINT signal:

When the user types Ctrl-\, Linux sends a SIGQUIT signal to the parent process and its children. Upon receiving a SIGQUIT signal:


Error Handling

Your program must handle an erroneous line gracefully by writing a descriptive error message to stderr and rejecting the line. An error message written by your program must begin with "programName: " where programName is argv[0], that is, the name of your program's executable binary file. Note that argv[0] typically will be ish, but need not be so.
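
The required message format can be produced by a small helper such as the one sketched below. The name writeError and the stream parameter are assumptions made for illustration; in the shell itself the stream would be stderr and programName would be argv[0].

```c
#include <assert.h>
#include <stdio.h>

/* Write "programName: message" followed by a newline to stream.
   writeError is a hypothetical name used for illustration. */
void writeError(FILE *stream, const char *programName,
                const char *message)
{
   assert(stream != NULL);
   assert(programName != NULL);
   assert(message != NULL);

   fprintf(stream, "%s: %s\n", programName, message);
}
```

A call such as writeError(stderr, argv[0], "some descriptive message") keeps the prefix correct even when the executable is invoked under a name other than ish.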

The error messages written by your program need not be identical to those written by the given sampleish program. However, the error messages written by your program must be at least as descriptive as those written by sampleish.

Your program must handle all user errors. It must be impossible for the user's input to cause your program to crash.


Memory Management

Your program must contain no memory leaks. For every call of malloc or calloc, eventually there must be a corresponding call of free. More specifically, your program must produce a clean meminfo report when the user terminates your program by typing Ctrl-d. It need not produce a clean meminfo report when the user terminates your program by issuing the exit command or by typing Ctrl-\ twice within 5 seconds.


Testing

Test your program by creating multiple files containing lines which your program must interpret, and by executing your program with each file as the lone command-line argument. The file /u/cos217/Assignment7/ishconfig contains a sequence of commands that can serve as a minimal test case. You must develop many more test files, or must add many more commands to the given test file.

Of course you also must test your program manually by typing commands at its prompt.


Logistics

Develop on nobel. Use emacs to create source code. Use make to automate the build process. Use gdb to debug.

An executable version of the assignment solution is available in /u/cos217/Assignment7/sampleish. Use it to resolve any issues concerning the desired functionality of your program. The /u/cos217/Assignment7 directory also contains the interface and implementation of the DynArray ADT that we discussed in precepts. You are welcome to use that ADT in your program.

Critique your code using the splint tool. Each time splint generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Similarly, critique your code using the critTer tool. Each time critTer generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Create a Makefile. The first dependency rule must build your entire program. The Makefile must maintain object (.o) files to allow for partial builds, and encode the dependencies among the files that comprise your program. As always, use the gcc217 command to build.

Create a readme file by copying the file /u/cos217/Assignment7/readme to your project directory, and editing the copy by replacing each area marked "<Complete this section.>" as appropriate.

One of the sections of the readme file requires you to list the authorized sources of information that you used to complete the assignment. Another section requires you to list the unauthorized sources of information that you used to complete the assignment. Your grader will not grade your submission unless you have completed those sections. To complete the "authorized sources" section of your readme file, copy the list of authorized sources given in the "Policies" web page to that section, and edit it as appropriate.

Submit your work electronically on nobel using this command:

submit 7 readme Makefile allsourcecodefiles

Don't forget to submit your .h files. If you use the DynArray ADT from precepts, then submit the dynarray.h and dynarray.c files.


Grading

The minimal requirement to receive credit for the assignment is that the program must build.

We will grade your work on quality from the user's and programmer's points of view. From the user's point of view, your program has quality if it behaves as it must. The correct behavior of your program is defined by the previous sections of this assignment specification and by the given sampleish program. Remember to use meminfo to help you make sure that all dynamically allocated memory is freed.

From the programmer's point of view, your code has quality if it is well styled and thereby easy to maintain. In part, good style is defined by the splint and critTer tools, and the rules given in The Practice of Programming (Kernighan and Pike) as summarized by the Rules of Programming Style document. The more course-specific style rules listed in the previous assignment specifications also apply. Proper function-level and file-level modularity will be a prominent part of your grade. To encourage good coding practices, we will deduct points if gcc217 generates warning messages.

If your program works only through the lexical analysis phase, then leave code in your program that writes the token array that the lexical analysis phase creates. Doing so will enable us to assign partial credit for your successful implementation of lexical analysis.

Similarly, if your program works only through the syntactic analysis phase, then leave code in your program that writes the command that the syntactic analysis phase creates. Doing so will enable us to assign partial credit for your successful implementation of syntactic analysis.

Remember that the Supplementary Information page lists detailed implementation requirements and recommendations.


This assignment was written by Robert M. Dondero, Jr. with contributions by many other faculty members.