Princeton University
COS 217: Introduction to Programming Systems

Assignment 5: UNIX Commands in IA-32 Assembly Language

Purpose

The purpose of this assignment is to help you learn about IA-32 architecture and assembly language programming. It will also give you the opportunity to learn more about the GNU/UNIX programming tools, especially bash, xemacs, gcc, and gdb for assembly language programs.

Background: wc

The UNIX operating system has a command named wc (word count). In its simplest form, wc reads characters from stdin until end-of-file, and prints to stdout a count of how many lines, words, and characters it has read. A word is a sequence of characters that is delimited by one or more whitespace characters.

Consider some examples. In the following, a space is shown as "s" and a newline character as "n".

If the file named proverb contains these characters:

Learningsissan
treasureswhichn
accompaniessitsn
ownerseverywhere.n
--sChinesesproverbn

then the command:

$ wc < proverb

prints this line to standard output:

  5 12 82

If the file proverb2 contains these characters:

Learningsissan
treasureswhichn
accompaniessitsn
ownerseverywhere.n
--sssChinesesproverb

(note that the last "line" does not end with a newline character) then the command:

$ wc < proverb2

prints this line to standard output:

 4 12 83
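
The counting rule described above can be sketched in C as follows. This is only an illustration and is not the mywc.c file supplied in /u/cos217/Assignment5: it counts over an in-memory buffer rather than reading stdin, and the names Counts and countText are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type; the name is illustrative and is not
   taken from the given mywc.c. */
struct Counts
{
   long lLineCount;
   long lWordCount;
   long lCharCount;
};

/* Count the lines, words, and characters in the first uLength bytes
   of pcText. A line is counted for each newline character. A word is
   counted each time a non-whitespace character begins the buffer or
   follows a whitespace character. */
struct Counts countText(const char *pcText, size_t uLength)
{
   struct Counts sCounts = {0, 0, 0};
   int iInWord = 0;
   size_t u;

   assert(pcText != NULL);

   for (u = 0; u < uLength; u++)
   {
      unsigned char c = (unsigned char)pcText[u];
      sCounts.lCharCount++;
      if (isspace(c))
      {
         if (c == '\n')
            sCounts.lLineCount++;
         iInWord = 0;
      }
      else if (!iInWord)
      {
         sCounts.lWordCount++;
         iInWord = 1;
      }
   }
   return sCounts;
}
```

Applied to the contents of proverb, this sketch yields 5 lines, 12 words, and 82 characters. Applied to the contents of proverb2, it yields 4 lines, 12 words, and 83 characters, because the final "line" has no newline character to count.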

Background: sort

Another commonly used UNIX command is sort. In its simplest form, sort reads lines from stdin, sorts them into ascending lexicographic order (that is, ordered by character code), and prints them to stdout. For example, if the file proverb contains these lines:

Learning is a
treasure which
accompanies its
owner everywhere.
-- Chinese proverb

then the command:

$ sort < proverb

prints these lines to standard output:

-- Chinese proverb
Learning is a
accompanies its
owner everywhere.
treasure which

Note that the special character '-' has an ASCII code that is less than the ASCII codes of all alphabetic characters, and so the line that begins with '-' appears first. Also note that the uppercase characters have ASCII codes that are less than the ASCII codes of the lowercase characters, and so the line that begins with 'L' appears before the line that begins with 'a'.
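
The ordering described above is the one produced by comparing lines with strcmp. The following sketch illustrates it using the standard library function qsort. It is an assumption-laden example, not the given code: the interfaces of the supplied quicksort.c, partition.c, and swap.c are not shown in this document, and the names compareLines and sortLines are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort. Each argument points to an element
   of the array being sorted, that is, to a (const char *). strcmp
   orders strings by the numeric codes of their characters. */
static int compareLines(const void *pvFirst, const void *pvSecond)
{
   const char *pcFirst = *(const char * const *)pvFirst;
   const char *pcSecond = *(const char * const *)pvSecond;
   return strcmp(pcFirst, pcSecond);
}

/* Sort the uCount strings in ppcLines into ascending order.
   This uses the standard qsort, not the assignment's quicksort. */
void sortLines(const char **ppcLines, size_t uCount)
{
   assert(ppcLines != NULL);
   qsort(ppcLines, uCount, sizeof(ppcLines[0]), compareLines);
}
```

Given the five lines of proverb in their original order, sortLines rearranges them into the order shown in the sort output above: '-' (code 45) sorts before 'L' (code 76), which sorts before 'a' (code 97).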

Your Task

Your task is to translate given C versions of wc and sort into IA-32 assembly language, as specified below. Your assembly language programs should have exactly the same behavior (i.e. should write exactly the same characters to stdout) as the given C programs.

mywc

The file mywc.c in the /u/cos217/Assignment5 directory contains a C program that implements the subset of the wc command described above. Translate that program into assembly language, thus creating a file named mywc.s. It is acceptable to use global (i.e. bss section and data section resident) variables in mywc.s, but we encourage you to use local (i.e. stack resident) variables instead.

mysort

The files mysort.c, quicksort.c, partition.c, and swap.c in the /u/cos217/Assignment5 directory contain a C program that implements the subset of the sort command described above.  Translate the code in the quicksort.c, partition.c, and swap.c files into assembly language, thus creating files named quicksort.s, partition.s, and swap.s. You need not translate the mysort.c file into assembly language.

Your source code files should be designed such that you can build the mysort program using mysort.c along with either quicksort.c or quicksort.s, either partition.c or partition.s, and either swap.c or swap.s.

Testing of mywc

Design a test plan for your mywc program. Your test plan should assume that the given C code is correct, and should focus on making sure that your assembly language code behaves exactly the same as the given C code does. Your test plan should include tests in three categories: (1) boundary condition testing, (2) logical path testing, and (3) stress testing.

Create text files to test your programs. Name each such file such that its prefix is "mywc" and its suffix is ".txt". The command "ls mywc*.txt" should display the names of all mywc test files, and only those files.

Describe your mywc test plan in your readme file. Your description should have this structure:

mywc boundary condition tests:

mywcXXX.txt:  Description of mywc boundary condition tests implemented by that file.
mywcYYY.txt:  Description of mywc boundary condition tests implemented by that file.
...

mywc logical path tests:

mywcXXX.txt:  Description of mywc logical path tests implemented by that file.
mywcYYY.txt:  Description of mywc logical path tests implemented by that file.
...

mywc stress tests:

mywcXXX.txt:  Description of mywc stress tests implemented by that file.
mywcYYY.txt:  Description of mywc stress tests implemented by that file.
...

Finally, create a UNIX shell script named testmywc to automate your mywc test plan. A UNIX shell script is simply a text file that contains UNIX commands, and that has been made executable via the chmod command, for example, "chmod 700 testmywc".

The testmywc script should build and execute your mywc program. The script should compare the output of the program built using the given C code vs. the output of the program built using your assembly language code.

It is acceptable for your testmywc script to call other scripts that you create. Each such script should have a name that is prefixed with "testmywc". The command "ls testmywc*" should display the names of all mywc test scripts, and only those scripts. You may find the grade1 and grade1diff scripts from Assignment 1 to be useful models.

Testing of mysort

Similarly, design a test plan for your mysort program. Your test plan should assume that the given C code is correct, and should focus on making sure that your assembly language code behaves exactly the same as the given C code does. Your test plan should include tests in three categories: (1) boundary condition testing, (2) logical path testing, and (3) stress testing.

Create text files to test your programs. Name each such file such that its prefix is "mysort" and its suffix is ".txt". The command "ls mysort*.txt" should display the names of all mysort test files, and only those files.

Describe your test plan in your readme file. Your description should have this structure:

mysort boundary condition tests:

mysortXXX.txt:  Description of mysort boundary condition tests implemented by that file.
mysortYYY.txt:  Description of mysort boundary condition tests implemented by that file.
...

mysort logical path tests:

mysortXXX.txt:  Description of mysort logical path tests implemented by that file.
mysortYYY.txt:  Description of mysort logical path tests implemented by that file.
...

mysort stress tests:

mysortXXX.txt:  Description of mysort stress tests implemented by that file.
mysortYYY.txt:  Description of mysort stress tests implemented by that file.
...

Finally, create a UNIX shell script named testmysort to automate your mysort test plan. The script should build and execute mysort programs. It should compare the output of the program built using the given C code vs. the output of programs built using your assembly language code. It should compare the output of at least these combinations:

It is acceptable for your testmysort script to call other scripts that you create. Each such script should have a name that is prefixed with "testmysort". The command "ls testmysort*" should display the names of all mysort test scripts, and only those scripts. Again, you may find the grade1 and grade1diff scripts from Assignment 1 to be useful models.

It is acceptable for your testmysort script to call "make".

Logistics

You should develop on hats. Use xemacs to create source code. Use gdb to debug.

You should not use a C compiler to produce your assembly language programs, or any parts of them. Doing so would be considered an instance of academic dishonesty. Instead you should produce your assembly language programs manually.

We encourage you to develop "flattened" C code (as described in precepts) to bridge the gap between the given "normal" C code and your assembly language code. Using flattened C code as a bridge can eliminate logic errors from your assembly language code, leaving only the possibility of translation errors.

We also encourage you to use your flattened C code as comments in your assembly language code. Such comments can clarify your assembly language code substantially.
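
As an illustration of the idea, the sketch below shows one small function in "normal" C and then in a flattened form, in which the while and if statements are rewritten as conditional gotos. It is only an assumed reading of "flattened": the conventions taught in precepts may differ in detail, and the function and label names here are hypothetical and are not taken from the given code.

```c
#include <assert.h>
#include <stddef.h>

/* "Normal" version: count the newline characters in a string. */
size_t countNewlines(const char *pcText)
{
   size_t uCount = 0;
   assert(pcText != NULL);
   while (*pcText != '\0')
   {
      if (*pcText == '\n')
         uCount++;
      pcText++;
   }
   return uCount;
}

/* Flattened version: the same logic, with each control structure
   replaced by labels and conditional gotos. Each statement now
   corresponds closely to a compare, a jump, or an arithmetic
   instruction. Note that each condition is inverted: the goto is
   taken when the original body should be skipped. */
size_t countNewlinesFlat(const char *pcText)
{
   size_t uCount;
   assert(pcText != NULL);
   uCount = 0;
loop:
   if (*pcText == '\0') goto loopEnd;
   if (*pcText != '\n') goto endIf;
   uCount++;
endIf:
   pcText++;
   goto loop;
loopEnd:
   return uCount;
}
```

Because the flattened version is still C, it can be compiled and tested against the normal version before any assembly language is written. Each flattened line can then serve as a comment above the instructions that implement it.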

You should submit:

Your readme file should contain:

Submit your work electronically via the commands:

/u/cos217/bin/i686/submit 5 mywc.s quicksort.s partition.s swap.s 
/u/cos217/bin/i686/submit 5 readme testmywc* mywc*.txt testmysort* mysort*.txt

Grading

As always, we will grade your work on quality from the user's and programmer's points of view. To encourage good coding practices, we will take off points based on warning messages during compilation.

Comments in your assembly language programs are especially important. Each assembly language function -- especially the main() function -- should have a comment that describes what the function does. Local comments within your assembly language functions are equally important. Comments copied from corresponding "flattened" C code are particularly helpful.

Testing is a substantial aspect of the assignment. Approximately 20% of the grade will be based upon your test plan as described in your readme file, and as implemented by your test scripts and data files.