Princeton University
COS 217: Introduction to Programming Systems

Assignment 6: A Heap Manager Module


Purpose

The purpose of this assignment is to help you understand how heap memory management works in C. It also will give you more opportunity to use the GNU/Linux programming tools, especially bash, emacs, gcc217, gdb, and make.

Students from past semesters reported taking, on average, 14 hours (each) to complete this assignment.


Rules

You may work with one teammate on this assignment. You need not work with a teammate, but we prefer that you do. Your preceptor will help you with your teammate search, if you so desire.

Your teammate must be from your precept. Sorry, no exceptions. So make sure you find a teammate as soon as you can. After all, if your precept has an odd number of students, then someone must work alone; don't let that person be you!

The "challenge" part for this assignment is detailed below. While doing the challenge part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the lab teaching assistants, other current students via Piazza, or any other people while working on the challenge part of an assignment, except for clarification of requirements. However you may consult with your teammate, with no limitations.

The challenge part is worth 6 percent of the assignment. So if you don't do any of the challenge part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 94 percent.


Background

A standard C programming environment contains four functions that allow dynamic memory management: malloc, free, calloc, and realloc. Those functions are used heavily in many C programs.

The COS 217 lectures describes multiple implementations of malloc and free, five of which use the heap section of memory:

Some background reading:


Your Tasks

You are given heapmgr.h, the interface of a HeapMgr (heap manager) module. It declares two functions:

void *HeapMgr_malloc(size_t uSize);
void  HeapMgr_free(void *pv);

You also are given the heapmgr1.c, heapmgr2.c, and heapmgr3.c implementations of the HeapMgr module. Your task is to compose heapmgr4.c, heapmgr5.c, and some supporting files.

Your heapmgr4.c must enhance heapmgr3.c so the HeapMgr_free function is efficient. To do that it must use a doubly-linked list and chunks that contain headers and footers (as described above, in lectures, and in precepts).

If designed properly, the HeapMgr_free function in your heapmgr4.c implementation will be efficient in all cases. However, the HeapMgr_malloc function in your heapmgr4.c implementation will be subject to poor worst-case behavior. Your heapmgr5.c implementation must enhance your heapmgr4.c implementation so the worst-case behavior of its HeapMgr_malloc function is not poor. To do that it must use multiple doubly-linked lists, alias bins (as described above, in lectures, and in precepts).

Your heapmgr4.c and heapmgr5,c implementations must not call the standard malloc, free, calloc, or realloc functions.

Your heapmgr4.c and heapmgr5.c implementations must check invariants by defining a thorough Checker_isValid function, and calling assert(Checker_isValid(...)) at the leading and trailing edges of the HeapMgr_malloc and HeapMgr_free functions.


The Given Files

The CourseLab directory /u/cos217/Assignment6 contains files that support the assignment. For reference, we list and briefly describe them here. The following sections of this document describe how to use them.


The testheapmgr Client

The testheapmgr.c client requires three command-line arguments. The first must be any one of seven strings, as shown in the following table, indicating which of seven tests the client should run:

Argument Test Performed
LifoFixed LIFO with fixed size chunks
FifoFixed FIFO with fixed size chunks
LifoRandom LIFO with random size chunks
FifoRandom FIFO with random size chunks
RandomFixed Random order with fixed size chunks
RandomRandom Random order with random size chunks
Worst Worst case order for a heap manager implemented using a single linked list

The second command-line argument is the number of calls of HeapMgr_malloc and HeapMgr_free that the client should execute. The third command-line argument is the (maximum) size, in bytes, of each memory chunk that the client should allocate and free.

Immediately before termination testheapmgr.c writes to stdout an indication of how much CPU time and heap memory it consumed. See the testheapmgr.c file for more information.


The Procedure

Develop on CourseLab, using emacs to create source code and gdb to debug. Perform these steps:


Step 0: Create a Project Directory and Study the Given Makefile

The CourseLab /u/cos217/Assignment6 directory contains the given files that are described previously in this document. Create a project directory, and copy all files except pride32.txt from the /u/cos217/Assignment6 directory to your project directory.

Then study the given Makefile. It is not a proper one: it does not maintain .o files to allow for partial builds, but instead builds directly from .c files (or given .o files) to executable binary files. Nevertheless, you will find it helpful. Using it will allow you to avoid much typing.


Step 1: Compose checker4.c

Compose your checker4.c implementation. Then issue these commands to build programs with buggy heapmgr4 implementations:

gcc217 -g testheapmgr.c heapmgr4bada.o checker4.c chunk4.c -o test4bada
gcc217 -g testheapmgr.c heapmgr4badb.o checker4.c chunk4.c -o test4badb
gcc217 -g testheapmgr.c heapmgr4badc.o checker4.c chunk4.c -o test4badc
gcc217 -g testheapmgr.c heapmgr4badd.o checker4.c chunk4.c -o test4badd
gcc217 -g testheapmgr.c heapmgr4bade.o checker4.c chunk4.c -o test4bade
gcc217 -g testheapmgr.c heapmgr4badf.o checker4.c chunk4.c -o test4badf
gcc217 -g testheapmgr.c heapmgr4badg.o checker4.c chunk4.c -o test4badg

Then run those programs using these commands:

./test4bada RandomRandom 1000 20000
./test4badb RandomRandom 1000 20000
./test4badc RandomRandom 1000 20000
./test4badd RandomRandom 1000 20000
./test4bade RandomRandom 1000 20000
./test4badf RandomRandom 1000 20000
./test4badg RandomRandom 1000 20000

If (and only if) each program writes a reasonable descriptive error message and aborts via an assert, then your checker4.c implementation is working properly.

It will be challenging to compose a checker4.c implementation that detects the errors in all of the given heapmgr4bad*.o files. But your checker4.c implementation should detect the errors in many of the given heapmgr4bad*.o files before you proceed to the next step.


Step 2: Compose heapmgr4.c

Compose your heapmgr4.c implementation. Then issue this command:

gcc217 -g testheapmgr.c heapmgr4.c checker4.c chunk4.c -o test4d

to build a debug version of a testing program. Run the program with various command-line arguments to make sure your heapmgr4.c implementation passes all tests defined in your checker4.c implementation.

Then issue these commands:

gcc217 -D NDEBUG -O testheapmgr.c heapmgr4.c chunk4.c -o test4
gcc217 -D NDEBUG -O testheapmgr.c heapmgr4good.o chunk4.c -o test4good
The -O (that's uppercase "oh") argument commands gcc to optimize the machine language code that it produces. When given the -O argument, gcc spends more time compiling your code so, subsequently, the computer spends less time executing your code. The -D NDEBUG argument commands gcc to define the NDEBUG macro, just as if the preprocessor directive #define NDEBUG appeared in the specified .c file(s). Defining the NDEBUG macro disables the calls of the assert macro within the HeapMgr implementations. Doing so also disables code within testheapmgr.c that performs (time consuming) checks of memory contents.

Those commands build a release version of a testing program using your heapmgr4.c implementation and a release version of a testing program using the given heapmgr4good.o implementation. Run the two programs with various command-line arguments. If your test4 program consumes exactly the same amount of memory as the test4good program does, and the two sometimes consume approximately the same amount of time, then you can be reasonably sure that your heapmgr4.c implementation is correct.


Step 3: Critique your checker4.c and heapmgr4.c Implementations

Critique your testheapmgr.c/heapmgr4.c/checker4.c/chunk4.c program using splint. Each time splint generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Similarly, critique your checker4.c and heapmgr4.c code using critTer. Each time critTer generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

When critTer critiques the testheapmgr.c file, it generates warnings about the use of "magic numbers," the length of some loops, and the length of the file. We judged that using magic numbers, defining deeply nested code, and defining long loops in testheapmgr.c was clearer than the alternative. And splitting testheapmgr.c into multiple files would have complicated the build process unnecessarily. So please ignore those warnings.
Similarly, when critTer critiques the heapmgr3.c file, it generates a warning which suggests that the HeapMgr_free function must validate its pv parameter through an assert. In fact the HeapMgr_free function must not validate its pv parameter through an assert, so please ignore that warning.

Step 4: Compose checker5.c

Compose your checker5.c implementation. Then issue these commands to build programs with buggy heapmgr5 implementations:

gcc217 -g testheapmgr.c heapmgr5bada.o checker5.c chunk5.c -o test5bada
gcc217 -g testheapmgr.c heapmgr5badb.o checker5.c chunk5.c -o test5badb
gcc217 -g testheapmgr.c heapmgr5badc.o checker5.c chunk5.c -o test5badc
gcc217 -g testheapmgr.c heapmgr5badd.o checker5.c chunk5.c -o test5badd
gcc217 -g testheapmgr.c heapmgr5bade.o checker5.c chunk5.c -o test5bade
gcc217 -g testheapmgr.c heapmgr5badf.o checker5.c chunk5.c -o test5badf
gcc217 -g testheapmgr.c heapmgr5badg.o checker5.c chunk5.c -o test5badg
gcc217 -g testheapmgr.c heapmgr5badh.o checker5.c chunk5.c -o test5badh

Then run those programs using these commands:

./test5bada RandomRandom 1000 20000
./test5badb RandomRandom 1000 20000
./test5badc RandomRandom 1000 20000
./test5badd RandomRandom 1000 20000
./test5bade RandomRandom 1000 20000
./test5badf RandomRandom 1000 20000
./test5badg RandomRandom 1000 20000
./test5badh RandomRandom 1000 20000

If (and only if) each program writes a reasonable descriptive error message and aborts via an assert, then your checker5.c implementation is working properly.

It will be challenging to compose a checker5.c implementation that detects the errors in all of the given heapmgr5bad*.o files. But your checker5.c implementation should detect the errors in many of the given heapmgr5bad*.o files before you proceed to the next step.


Step 5: Compose heapmgr5.c

Compose your heapmgr5.c implementation. Then issue this command:

gcc217 -g testheapmgr.c heapmgr5.c checker5.c chunk5.c -o test5d

to build a debug version of a testing program. Run the program with various command-line arguments to make sure your heapmgr5.c implementation passes all tests defined in your checker5. implementation.

Then issue these commands:

gcc217 -D NDEBUG -O testheapmgr.c heapmgr5.c chunk5.c -o test5
gcc217 -D NDEBUG -O testheapmgr.c heapmgr5good.o chunk5.c -o test5good

Those commands build a release version of a testing program using your heapmgr5.c implementation and a release version of a testing program using the given heapmgr5good.o implementation. Run the two programs with various command-line arguments. If your test5 program consumes exactly the same amount of memory as the test5good program does, and the two sometimes consume approximately the same amount of time, then you can be reasonably sure that your heapmgr5.c implementation is correct.


Step 6: Critique your checker5.c and heapmgr5.c Implementations

Critique your testheapmgr.c/heapmgr5.c/checker5.c/chunk5.c program using splint. Each time splint generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Similarly, critique your checker5.c and heapmgr5.c code using critTer. Each time critTer generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.


Step 7: Build Using the Given Implementations

Issue these commands to build programs using the given HeapMgr implementations:

gcc217 -D NDEBUG -O testheapmgr.c heapmgr1.c -o test1
gcc217 -D NDEBUG -O testheapmgr.c heapmgr2.c -o test2
gcc217 -D NDEBUG -O testheapmgr.c heapmgr3.c chunk3.c -o test3

Step 8: Compose a readme File

Edit your copy of the given readme file by answering each question that is expressed therein.

In particular, in your readme file record the CPU times and heap memory consumed by the test1, test2, test3, test4, and test5 programs with tests RandomRandom and Worst, with call count 100000, and with maximum chunk sizes 2000 and 20000. Note that if the CPU time consumed is more than 5 minutes, testheapmgr will abort execution. To report the time and memory consumption, paste the output of the testheap script into your readme file.

You must paste the output of the testheap script into your readme file. An alternative approach would be to (1) generate time and memory consumption statistics by running the test1...test5 programs directly from the bash command prompt rather than via the testheap script, and (2) manually type the statistics into your readme file using a format of your choosing. But the alternative would be more difficult for you and your grader. So the alternative approach is unacceptable; you will lose points if you use it.

Step 9: The Challenge Part

You are given the words.c file. The words program reads words from the file whose name is argv[1], and then writes to stdout each distinct word and its occurrence count. Then the program does the same for argv[2], argv[3], and so forth. Finally the program writes CPU time and heap memory usage statistics to stderr.

Answer these questions in your readme file:

  1. Suppose you run the words program with one command-line argument which is pride32.txt. Which implementation (heapmgr1.c, heapmgr2.c, heapmgr3.c, heapmgr4.c, or heapmgr5.c) is the best? The "best" implementation is the one that offers the best balance of time and space efficiency.

  2. Suppose you run the words program with fifteen command-line arguments, each of which is pride32.txt. Which implementation (heapmgr1.c, heapmgr2.c, heapmgr3.c, heapmgr4.c, or heapmgr5.c) is the best? Again, the "best" implementation is the one that offers the best balance of time and space efficiency.

  3. You will find that your answer to Question 1 differs from your answer to Question 2. Why is that so? Explain in terms of the salient characteristics of the words program and the HeapMgr implementations.

You could develop your answers simply by considering the characteristics of the words program and the HeapMgr implementations. But it would be wise to do some experiments to confirm your answers.

To do such experiments, consider using the given symtable.h and symtablehash.o files. The symtable.h file contains the interface of a SymTable ADT. The symtablehash.o file contains an implementation of a SymTable ADT that does not call malloc, calloc, realloc, or free; instead it calls HeapMgr_malloc and HeapMgr_free. You may use those files, along with the given HeapMgr implementations and your HeapMgr implementations, to build some words programs. Then you may analyze CPU time and heap memory usage statistics of the programs by running them using pride32.txt.

The pride32.txt file is large. To avoid the possibility of exhausting your CourseLab disk quota, we recommend that you not copy pride32.txt from the /u/cos217/Assignment6 directory to your working directory. Instead we recommend that you use the file that already exists in the /u/cos217/Assignment6 directory. To do that you'll need to use the absolute file name /u/cos217/Assignment6/pride32.txt as command-line arguments to your words programs.


Step 10: Provide Feedback

Provide the instructors with your feedback on the assignment. To do that, issue this command:

FeedbackCOS217.py 6

and answer the questions that it asks. (When answering the numeric questions, please enter the average of the responses that you and your teammate would provide individually.) That command stores its questions and your answers in a file named feedback in your working directory.


Step 11: Submit

Submit your work electronically on CourseLab. If you worked with a teammate, then one of the teammates must submit all of your team's files, and the other teammate must submit none of your team's files. Your readme file and your source code files must contain both your name and your teammate's name. Use these commands to submit:

submit 6 checker4.c heapmgr4.c
submit 6 checker5.c heapmgr5.c
submit 6 readme feedback

Program Style

In part, good program style is defined by the splint and critTer tools, and by the rules given in The Practice of Programming (Kernighan and Pike) as summarized by the Rules of Programming Style document.

The more course-specific style rules listed in the previous assignment specifications also apply, as does this one: your heapmgr4.c and heapmgr5.c implementations must have good function-level modularity.


Grading

We will grade your work on two kinds of quality:

The thoroughness of your Checker_isValid functions will be a prominent part of your grade.

To encourage good coding practices, we will deduct points if gcc217 generates warning messages.


Keys to Success

There are three keys to success for this assignment:

(1) Function Modularity

Your code must have proper function modularity. The heapmgr3.c implementation defines functions HeapMgr_malloc, HeapMgr_free, HeapMgr_useChunk, and HeapMgr_getMoreMemory. In your heapmgr4.c and heapmgr5.c implementations you must define additional functions that help make your code easier to read and maintain.

We can't suggest the specific additional functions you should define; part of the assignment is that you determine what they should be. We suggest that you study the pseudocode given in precepts. Consider each important verb in the pseudocode. Some candidates for additional functions probably will be obvious.

(2) Internal Testing

Your code must check invariants thoroughly. That is, for each implementation you must define a thorough Checker_isValid function. The Checker_isValid function must check the HeapMgr object's data structures, making sure that those aspects of the data structures that must not vary indeed have not varied. You must call the Checker_isValid function at the leading and trailing edges of your HeapMgr_malloc and HeapMgr_free functions.

The Checker_isValid function for the heapmgr3 implementation defines several checks of invariants. Some of them apply to your implementations; some of them do not. Your implementations must use more complex data structures, so the Checker_isValid functions for your implementations must define more checks of invariants.

It would be a mistake to develop your Checker_isValid functions after developing your heapmgr4.c and heapmgr5.c implementations. That is, it would be a mistake to proceed to Step 2 before doing a thorough job on Step 1, or to proceed to Step 5 before doing a thorough job on Step 4. After all, your Checker_isValid functions will help you to develop your heapmgr4.c and heapmgr5.c implementations.

(3) External Testing

Use the given testheapmgr.c client properly. Run the test4 and test4good programs repeatedly, progressing from simple tests (e.g., LifoFixed) to more complex ones (e.g., RandomRandom), from small numbers of calls of HeapMgr_malloc and HeapMgr_free to large number of calls, and from small (maximum) chunk sizes to large (maximum) chunk sizes. For each run compare the performance — especially the memory consumption — of the two programs. Do the same for test5 and test5good.


This assignment was written by Robert M. Dondero, Jr. and Iasonas Petras.