Princeton University
COS 217:  Introduction to Programming System

Precepts 23 and 24:  Ish Details

Purpose

Describe the details of each ish development stage, with an emphasis on the underlying UNIX system calls

Stage 0:  Preliminaries

See the Assignment 6 Development Stages handout

Stage 1:  Lexical Analysis

See the Assignment 6 Development Stages handout

Review the assignment statement under the heading "Lexical Analysis"

Recommendation:  Design your lexical analyzer as a deterministic finite state automaton (FSA)

Example of a lexical analyzer (for a different problem):  fsa.c

What it does

Reads stdin until EOF (perhaps simulated by cntrl-D)

Prints

List of all numbers read:  a "number" is a sequence of decimal digits

List of all words read:  a "word" is a sequence of uppercase or lowercase letters and underscores

How it works

Uses List ADT from early precepts -- see list.h and list.c

Uses a FSA

Input:  a string

Output:  A structure containing the first token (number or word) in that string

See handwritten state diagram

Algorithm:

for each line in stdin
for each token in line
call FSA to get token
store token in number list or word list
print number list and word list

Look at program, noting:

struct Token

isEndOfLine, isDigit, and isLetter functions

main function

getToken function

The implementation of the FSA

Note the need to "move backward one character" in anticipation of next call to getToken

Discussion of ish's lexical analyzer:

Ish's lexical analysis phase can be similar to the example

Consider using the List ADT from previous precepts

Beware:  These two commands are very different:

echo one > two
echo one ">" two

ish's lexical analyzer must capture the difference between > and ">"

Hint:  As shown in dfa.c, a token need not be a simple character string; it could be a struct, or even an ADT

Stage 2:  Syntactic Analysis (alias Parsing)

(See the Assignment 6 Development Stages handout)

Straightforward

Suggestion:  Detect as many syntactic errors as possible

Examples of syntactic errors that can/should be detected by syntactic analysis:

% < one
% echo one |
% echo one > myfile > myotherfile
% echo one >
% ls -al > myfile | sort | more
% ls -al | sort > myfile | more
% ls -al | sort | more < myfile

Stage 3:  Built-In Command Execution

(See the Assignment 6 Development Stages handout)

Straightforward

Suggestion:  Learn about chdir, setenv, and unsetenv system calls

Stage 4:  Executable Binary Command Execution

(See the Assignment 6 Development Stages handout)

Less straightforward

Suggestion:  Learn about the fork, exec, and wait system calls

Examples:

(Give live demonstrations where appropriate)

(1) hello.c

getpid system call

Returns id of process

(2) testexec.c

exec system call

Overlays current process with another

On failure, sets global errno variable (as do all system calls)

Note:  there are 6 exec system calls; dimensions:  

Command name absolute pathname or relative pathname

Command line arguments passed as an array, or as actual parameters

Environment specified as actual parameter, or inherited from original process

perror standard C function

Actual parameter should be program name

Examines global errno variable

Prints appropriate error message that includes program name

Suggestion:  call perror after a failed system call

(3) testfork.c

fork system call

Creates a child process -- a duplicate of the current process

Duplicate has its own stack, heap, data section, bss section, text section

Duplicate has a different process id

Duplicate has same program counter

Calling fflush standard C function after printf

Sketch two copies of program on board to emphasize multiprocessing

(4) testforkret.c

fork's return value is not necessarily a process id; in the child process it's zero

(5) testforkexit.c

Common structure of program that uses fork

In child process, must use _exit, and not exit

For explanation, see man page for fork 

(6) testforkloop.c

Multiprocessing context switching

(7) testforkexec.c

Common combination of fork and exec to spawn a child process, and then overlay it

Note sequencing problem

Solution...

(8) testforkexecwait.c

wait system call

Waits for child process to finish

If n child processes have been forked, can call wait n times to wait for all of them to finish

Stage 5:  Executable Binary Command Execution with execv or execl

(See the Assignment 6 Development Stages handout)

Relatively straightforward

Suggestions:  

Learn about the PATH environment variable

Learn about the strtok standard C library function

Learn about the access system call

Stage 6:  Executable Binary Command Execution with I/O Redirection

(See the Assignment 6 Development Stages handout)

Suggestion:  Learn about the dup (and dup2) system calls

Examples:

(9) testdupstdout.c

creat system call

Creates a new file

Rewrites file if it exists

Returns a file descriptor, or -1 if unsuccessful

(Really should check for error)

close system call

Breaks the connection between a filename and a file descriptor

Frees the file descriptor for use with another file

dup system call

Duplicates specified file descriptor into first unused element of file descriptor table

Can be used to programmatically redirect stdout

(10) testdupstdin.c

open system call

Opens a file for reading or writing

Returns a file descriptor, or -1 if unsuccessful

(Really should check for error)

dup system call to programmatically redirect stdin

Stage 7:  Pipeline Execution

(See the Assignment 6 Development Stages handout)

Suggestion:  Learn about the pipe system call, and how to use it in coordination with fork, exec, wait, and dup

Examples:

(11) testpipe.c

pipe system call

Creates a pipe through which processes (on the same machine) can communicate

See testpipe.pdf for an example trace

Creates a "producer" child process and a "consumer" child process, and a pipe through which the producer sends "somedata" to the consumer

Note the importance of closing the pipe file descriptors in the parent

Note that parent must sometimes wait for all children to finish

(12) testpipedup.c

pipe system call used in coordination with dup and close system calls

Producer child process writes to stdout; consumer child process reads from stdin

Stage 8:  Process Control

(See the Assignment 6 Development Stages handout)

Suggestion:  Learn about the signal system call

Examples:

(13) testsignal.c

Using the signal system call to install your own signal handler

(14) testsignalignore.c

Ignoring signals in a parent process, but not in a child

ish must do the same

Stage 9:  History (for extra credit)

(See the Assignment 6 Development Stages handout)

history built-in command writes to stdout

Might reasonably have stdout redirected

Might reasonably be part of a pipeline

So must execute history command in a child process (alias subshell)

Conclusion

For your testing and our grading:

Make sure your ish interprets lines of ~/.ishrc

Make sure your ish prints each line of ~/.ishrc immediately after reading it

If you don't, grader will need to manually type hundreds of commands to test your ish

Sadly, grade would be adversely affected

You may need to learn about additional UNIX system calls, none of which is conceptually difficult

See the man pages

See the Kernighan and Pike textbook

See http://www.opengroup.org/onlinepubs/7908799/xsh/unistd.h.html

Good luck!

Copyright © 2002 by Robert M. Dondero, Jr