Princeton University
COS 217: Introduction to Programming Systems

Assignment 7: A Unix Shell


Purpose

The purpose of this assignment is to help you learn about Unix processes, low-level input/output, and signals. It also will give you ample opportunity to define software modules; in that sense the assignment is a capstone for the course.


Rules

This assignment is an individual assignment, not a teams-of-two assignment.

Signal handling (as described below) is the "on your own" part of this assignment. That part is worth 8% of this assignment.


Background

A Unix shell is a program that makes the facilities of the operating system available to interactive users. There are several popular Unix shells: sh (the Bourne shell), csh (the C shell), and bash (the Bourne Again shell) are a few.


Your Task

Your task in this assignment is to create a program named ish. Your program should be a minimal but realistic interactive Unix shell. A Supplementary Information page lists detailed implementation requirements and recommendations.


Initialization and Termination

When first started, your program should read and interpret lines from a configuration file, provided that the file exists and is readable. The file name may be given as a command-line argument to the shell. If (and only if) no file name is supplied on the command line, your program should default to .ishrc in the user's HOME directory.

If a file name is supplied as a command-line argument, you may assume that the file's path will not be longer than 1023 characters. If the argument is longer than 1023 characters, then your program should treat it as unreadable (that is, proceed to interactive operation), and should not corrupt memory or exit.

Note that the name of the default configuration file is .ishrc (not ishrc, .ISHRC, etc.) and that it resides in the user's HOME directory, not the current (that is, working) directory.
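The rules above for choosing the configuration file can be sketched as follows. This is only one possible approach. The names Config_getPath and MAX_PATH_LENGTH are hypothetical, and applying the 1023-character limit to the default path (not only to the command-line argument) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_PATH_LENGTH = 1023 };

/* Store in pcPath (capacity MAX_PATH_LENGTH + 1) the name of the
   configuration file: pcArg if it is non-NULL, and HOME/.ishrc
   otherwise. Return 1 on success, or 0 if no usable name exists
   (the name is too long, or HOME is not set). */
static int Config_getPath(const char *pcArg, char *pcPath)
{
   const char *pcDefaultName = "/.ishrc";
   const char *pcHome;

   assert(pcPath != NULL);

   if (pcArg != NULL)
   {
      /* Check the length before copying, so that an overlong
         argument cannot overflow pcPath. */
      if (strlen(pcArg) > MAX_PATH_LENGTH)
         return 0;
      strcpy(pcPath, pcArg);
      return 1;
   }

   pcHome = getenv("HOME");
   if (pcHome == NULL)
      return 0;
   if (strlen(pcHome) + strlen(pcDefaultName) > MAX_PATH_LENGTH)
      return 0;
   strcpy(pcPath, pcHome);
   strcat(pcPath, pcDefaultName);
   return 1;
}
```

A caller might pass argv[1] when argc is at least 2 and NULL otherwise, and proceed directly to interactive operation when the function returns 0 or when the named file cannot be opened.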

To facilitate your debugging and our testing, your program should print each line that it reads from the configuration file immediately after reading it. Your program should print a percent sign and a space (% ) before each such line.

Your program should terminate when the user types Ctrl-d or issues the exit command. (See also the section below entitled "Signal Handling.")


Interactive Operation

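After start-up processing, your program repeatedly should read a line from the standard input stream, perform lexical analysis on the line, perform syntactic analysis on the resulting tokens, and execute the resulting command, as described in the sections that follow.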


Lexical Analysis

From the user's point of view, a token should be a word. (Your program may represent a token using a richer data structure.) More formally, from the user's point of view a token should consist of a sequence of non-white-space characters that is separated from other tokens by white-space characters. There should be two exceptions:

Your program should assume that no line of the standard input stream contains more than 1023 characters; the terminating newline character is included in that count. In other words, your program should assume that a string composed from a line of input can fit in an array of characters of length 1024. If a line of the standard input stream is longer than 1023 characters, then your program need not handle it properly; but it should not corrupt memory.
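One way to honor the 1024-character array bound without corrupting memory is to read with fgets and to discard the remainder of any overlong line. The sketch below assumes hypothetical names (Line_read, MAX_LINE_SIZE) and assumes that an overlong line is reported to the caller rather than processed; the assignment requires only that memory not be corrupted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MAX_LINE_SIZE = 1024 };

/* Read one line from psFile into acLine, whose capacity is
   MAX_LINE_SIZE. Return 1 if a line was read, or 0 at end of file.
   If the line did not fit, discard its remainder and set
   *piTooLong to 1; otherwise set *piTooLong to 0. */
static int Line_read(FILE *psFile, char acLine[], int *piTooLong)
{
   int iChar;

   assert(psFile != NULL);
   assert(acLine != NULL);
   assert(piTooLong != NULL);

   *piTooLong = 0;

   /* fgets stores at most MAX_LINE_SIZE - 1 characters plus '\0'. */
   if (fgets(acLine, MAX_LINE_SIZE, psFile) == NULL)
      return 0;

   /* A full buffer with no newline means that the line was cut. */
   if (strchr(acLine, '\n') == NULL
       && strlen(acLine) == MAX_LINE_SIZE - 1)
   {
      *piTooLong = 1;
      while ((iChar = getc(psFile)) != EOF && iChar != '\n')
         ;
   }
   return 1;
}
```

Because the excess characters are consumed, the next call starts at the beginning of the next line rather than in the middle of the overlong one.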


Syntactic Analysis

From the user's point of view, a command should be a sequence of tokens, the first of which specifies the command name. (Your program may represent a command using a richer data structure.)

The '<' token should indicate that the following token is the name of a file. Your program should redirect the command's standard input stream to that file. It should be an error to redirect a command's standard input stream more than once.

The '>' token should indicate that the following token is the name of a file. Your program should redirect the command's standard output stream to that file. It should be an error to redirect a command's standard output stream more than once.
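The redirection rules above can be illustrated with a small parser over an array of tokens. This is a sketch under several assumptions: tokens are plain strings (so it cannot distinguish a special token from an ordinary word that happens to be spelled the same way, which a richer token type could do), MAX_TOKENS is an arbitrary bound, a redirection token with no file name after it is treated as an error, and all of the names (Command_parse, struct Command, the PARSE_ constants) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_TOKENS = 512 };

enum ParseStatus
{
   PARSE_OK,
   PARSE_NO_NAME,
   PARSE_NO_FILE,
   PARSE_MULTIPLE_STDIN,
   PARSE_MULTIPLE_STDOUT
};

struct Command
{
   const char *pcStdin;    /* NULL if stdin is not redirected */
   const char *pcStdout;   /* NULL if stdout is not redirected */
   size_t uArgCount;       /* includes the command name */
   const char *apcArgs[MAX_TOKENS + 1];   /* NULL terminated */
};

/* Return 1 if pcToken is "<" or ">", and 0 otherwise. */
static int Command_isRedirect(const char *pcToken)
{
   return strcmp(pcToken, "<") == 0 || strcmp(pcToken, ">") == 0;
}

/* Fill *psCommand from the uCount tokens in apcTokens. Return
   PARSE_OK on success, or a status describing the first error. */
static enum ParseStatus Command_parse(const char *apcTokens[],
   size_t uCount, struct Command *psCommand)
{
   size_t i;
   int iIsInput;
   const char **ppcTarget;

   assert(apcTokens != NULL);
   assert(psCommand != NULL);
   assert(uCount <= MAX_TOKENS);

   psCommand->pcStdin = NULL;
   psCommand->pcStdout = NULL;
   psCommand->uArgCount = 0;

   for (i = 0; i < uCount; i++)
   {
      if (! Command_isRedirect(apcTokens[i]))
      {
         psCommand->apcArgs[psCommand->uArgCount++] = apcTokens[i];
         continue;
      }
      iIsInput = (apcTokens[i][0] == '<');
      ppcTarget = iIsInput ? &psCommand->pcStdin : &psCommand->pcStdout;
      /* Redirecting the same stream twice is an error. */
      if (*ppcTarget != NULL)
         return iIsInput ? PARSE_MULTIPLE_STDIN : PARSE_MULTIPLE_STDOUT;
      /* The token after "<" or ">" must be a file name. */
      if (i + 1 == uCount || Command_isRedirect(apcTokens[i + 1]))
         return PARSE_NO_FILE;
      *ppcTarget = apcTokens[++i];
   }
   psCommand->apcArgs[psCommand->uArgCount] = NULL;

   return (psCommand->uArgCount == 0) ? PARSE_NO_NAME : PARSE_OK;
}
```

Keeping apcArgs NULL terminated means that the array can later be handed to execvp without copying.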


Execution

Your program should interpret four shell built-in commands:

setenv var [value]: If environment variable var does not exist, then your program should create it. Your program should set the value of var to value, or to the empty string if value is omitted. Note: Initially, your program inherits environment variables from its parent. Your program should be able to modify the value of an existing environment variable or create a new environment variable via the setenv command. Your program should be able to set the value of any environment variable; but the only environment variable that it explicitly uses is HOME.

unsetenv var: Your program should destroy the environment variable var.

cd [dir]: Your program should change its working directory to dir, or to the HOME directory if dir is omitted.

exit: Your program should exit with exit status 0. Note: exit takes no command-line arguments.

Note that those built-in commands should neither read from the standard input stream nor write to the standard output stream. So your program should ignore file redirection with those built-in commands. Your program should still recognize invalid commands, however.
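The four built-in commands map closely onto the POSIX functions setenv, unsetenv, chdir, and exit. The sketch below shows one possible dispatch function. The names Builtin_run and the BUILTIN_ constants are hypothetical, and treating a wrong number of arguments as an error is an assumption about what counts as an invalid command.

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv and unsetenv */

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

enum BuiltinStatus { BUILTIN_NONE, BUILTIN_OK, BUILTIN_ERROR };

/* If apcArgs[0] names a built-in command, then run it with the
   uArgCount arguments in apcArgs and return BUILTIN_OK or
   BUILTIN_ERROR. Otherwise return BUILTIN_NONE. */
static enum BuiltinStatus Builtin_run(char *apcArgs[], size_t uArgCount)
{
   const char *pcDir;
   int iResult;

   assert(apcArgs != NULL);
   assert(uArgCount >= 1);

   if (strcmp(apcArgs[0], "setenv") == 0)
   {
      if (uArgCount < 2 || uArgCount > 3)
         return BUILTIN_ERROR;
      /* The final argument 1 asks setenv to overwrite. */
      iResult = setenv(apcArgs[1],
         (uArgCount == 3) ? apcArgs[2] : "", 1);
      return (iResult == 0) ? BUILTIN_OK : BUILTIN_ERROR;
   }
   if (strcmp(apcArgs[0], "unsetenv") == 0)
   {
      if (uArgCount != 2)
         return BUILTIN_ERROR;
      iResult = unsetenv(apcArgs[1]);
      return (iResult == 0) ? BUILTIN_OK : BUILTIN_ERROR;
   }
   if (strcmp(apcArgs[0], "cd") == 0)
   {
      if (uArgCount > 2)
         return BUILTIN_ERROR;
      pcDir = (uArgCount == 2) ? apcArgs[1] : getenv("HOME");
      if (pcDir == NULL)
         return BUILTIN_ERROR;
      return (chdir(pcDir) == 0) ? BUILTIN_OK : BUILTIN_ERROR;
   }
   if (strcmp(apcArgs[0], "exit") == 0)
   {
      if (uArgCount != 1)
         return BUILTIN_ERROR;
      exit(0);
   }
   return BUILTIN_NONE;
}
```

These commands must run in the shell process itself rather than in a forked child, because a child's changes to its environment or working directory are not visible to its parent.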

If the command is not a built-in command, then your program should consider the command name to be the name of a file that contains code to be executed. Your program should fork a child process and pass the file name, along with its arguments, to the execvp system call. If the attempt to execute the file fails, then your program should print an error message indicating the reason for the failure.
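The fork-and-execvp pattern described above might look like the following. This is a minimal sketch, not the required design: the function name Command_execute and its parameters are hypothetical, and the choice to report the failure from the child via strerror(errno) is one of several reasonable options.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child process that executes the file apcArgs[0] with the
   NULL-terminated argument array apcArgs, and wait for the child to
   finish. Use pcProgName as the prefix of any error message. Return
   the child's wait status, or -1 if fork or waitpid failed. */
static int Command_execute(const char *pcProgName, char *apcArgs[])
{
   pid_t iPid;
   int iStatus;

   assert(pcProgName != NULL);
   assert(apcArgs != NULL);
   assert(apcArgs[0] != NULL);

   /* Flush buffered output so that the child does not inherit and
      later repeat it. */
   fflush(NULL);

   iPid = fork();
   if (iPid == -1)
   {
      perror(pcProgName);
      return -1;
   }

   if (iPid == 0)
   {
      execvp(apcArgs[0], apcArgs);
      /* Control reaches this point only if execvp failed. */
      fprintf(stderr, "%s: %s: %s\n", pcProgName, apcArgs[0],
         strerror(errno));
      _exit(EXIT_FAILURE);
   }

   /* The parent waits, so the child runs in the foreground. */
   if (waitpid(iPid, &iStatus, 0) == -1)
   {
      perror(pcProgName);
      return -1;
   }
   return iStatus;
}
```

The child must terminate after a failed execvp; otherwise two copies of the shell would continue reading commands.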

If the standard input stream is redirected to a file that does not exist, then your program should print an appropriate error message.

If the standard output stream is redirected to a file that does not exist, then your program should create it. If the standard output stream is redirected to a file that already exists, then your program should destroy the file's contents and rewrite the file from scratch. Your program should set the permissions of the file to 0600.
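Redirection as described above is commonly implemented with open and dup2, called in the child process after fork and before execvp. The sketch below assumes hypothetical function names (Redirect_to, Redirect_stdin, Redirect_stdout). Note that the mode argument of open applies only when the file is created, so whether an existing file's permissions must also be changed to 0600 is a question to settle against sampleish.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open pcFile with the given flags and make iTargetFd refer to it.
   Return 0 on success, or -1 (with errno set) on failure. */
static int Redirect_to(const char *pcFile, int iFlags, int iTargetFd)
{
   int iFd;

   assert(pcFile != NULL);

   /* The mode 0600 is used only if O_CREAT creates the file. */
   iFd = open(pcFile, iFlags, 0600);
   if (iFd == -1)
      return -1;
   if (iFd != iTargetFd)
   {
      if (dup2(iFd, iTargetFd) == -1)
      {
         close(iFd);
         return -1;
      }
      close(iFd);
   }
   return 0;
}

/* Redirect the standard input stream to the existing file pcFile.
   Fail if the file does not exist. */
static int Redirect_stdin(const char *pcFile)
{
   return Redirect_to(pcFile, O_RDONLY, STDIN_FILENO);
}

/* Redirect the standard output stream to pcFile, creating the file
   if necessary and destroying any previous contents. */
static int Redirect_stdout(const char *pcFile)
{
   return Redirect_to(pcFile, O_WRONLY | O_CREAT | O_TRUNC,
      STDOUT_FILENO);
}
```

When Redirect_stdin returns -1, the caller can use errno (for example via perror or strerror) to produce the error message for a nonexistent input file.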


Process Handling

All child processes forked by your program should run in the foreground. Your program need not support background processes.


Signal Handling

When the user types Ctrl-c, Linux sends a SIGINT signal to the parent process and its children. Upon receiving a SIGINT signal:

When the user types Ctrl-\, Linux sends a SIGQUIT signal to the parent process and its children. Upon receiving a SIGQUIT signal:


Error Handling

Your program should handle an erroneous line gracefully by rejecting the line and writing a descriptive error message to the standard error stream. An error message written by your program should begin with "programName: " where programName is argv[0], that is, the name of your program's executable binary file. Note that argv[0] typically will be ish, but need not be so.
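The required message format can be captured in one small helper so that every module reports errors the same way. The name Error_print is hypothetical, and passing the stream as a parameter is a choice made here only to keep the sketch easy to test; the assignment itself calls for the standard error stream.

```c
#include <assert.h>
#include <stdio.h>

/* Write to psStream a line consisting of pcProgName, a colon, a
   space, and pcMessage. */
static void Error_print(FILE *psStream, const char *pcProgName,
   const char *pcMessage)
{
   assert(psStream != NULL);
   assert(pcProgName != NULL);
   assert(pcMessage != NULL);

   fprintf(psStream, "%s: %s\n", pcProgName, pcMessage);
}
```

A typical call might be Error_print(stderr, argv[0], "missing command name"), where the message text is only an example.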

The error messages written by your program need not be identical to those written by the given sampleish program. However, the error messages written by your program should be at least as descriptive as those written by sampleish.

Your program should handle all user errors. It should be impossible for the user's input to cause your program to crash.


Memory Management

Your program should contain no memory leaks. For every call of malloc or calloc, eventually there should be a corresponding call of free. More specifically, your program should produce a clean Meminfo report when the user terminates your program by typing Ctrl-d. It need not produce a clean Meminfo report when the user terminates your program by issuing the exit command or by typing Ctrl-\ twice within 5 seconds.


Testing

Test your program by creating multiple files containing lines which your program should interpret, repeatedly copying each to the .ishrc file in your HOME directory, and restarting your program. Another way to test is to repeatedly restart your program with each different file as its lone command-line argument. The file /u/cos217/Assignment7/.ishrc contains a sequence of commands that can serve as a minimal test case. You should develop many more test files.

Of course you also should test your program manually by typing commands at its prompt.


Logistics

Develop on hats. Use emacs to create source code. Use make to automate the build process. Use gdb to debug.

An executable version of the assignment solution is available in /u/cos217/Assignment7/sampleish. Use it to resolve any issues concerning the desired functionality of your program. The /u/cos217/Assignment7 directory also contains the interface and implementation of the DynArray ADT that we discussed in precepts. You are welcome to use that ADT in your program.

You should submit:

Your readme file should contain:

Submit your work electronically via the command:

submit 7 readme makefile allsourcecodefiles

If you use the DynArray ADT from precepts, then submit the dynarray.h and dynarray.c files.


Grading

We will grade your work on quality from the user's point of view and from the programmer's point of view. From the user's point of view, your program has quality if it behaves as it should. The correct behavior of your program is defined by the previous sections of this assignment specification and by the given sampleish program.

From the programmer's point of view, your program has quality if it is well styled and thereby simple to maintain. See the specifications of previous assignments for guidelines concerning style. Proper function-level and file-level modularity will be a prominent part of your grade. To encourage good coding practices, we will deduct points if gcc217 generates warning messages. We also will deduct points if splint generates warning messages that are not explained in your readme file.

If your program works only through the lexical analysis phase, then leave code in your program that prints the token array that the lexical analysis phase creates. Doing so will enable us to assign partial credit for your successful implementation of lexical analysis. Similarly, if your program works only through the syntactic analysis phase, then leave code in your program that prints the command that the syntactic analysis phase creates. Doing so will enable us to assign partial credit for your successful implementation of syntactic analysis.

Remember that the Supplementary Information page lists detailed implementation requirements and recommendations.