Princeton University
COS 217:  Introduction to Programming Systems

Assignment 2:  A Set ADT

Purpose

The purpose of this assignment is to help you learn how to create Abstract Data Types (ADTs) via the C programming language.  It will also give you the opportunity to learn more about the GNU/UNIX programming tools, especially bash, Emacs, gcc, gdb, and make.

Background

A set is a collection of bindings.  Each binding consists of a key and a value.  A key uniquely identifies its binding; a value is data that is somehow pertinent to its key. 

Programming systems use sets often.  For example compilers and assemblers use sets to relate symbols to their meanings.

Your Task

Your task in this assignment is to create ADTs that implement the "set" concept.  More precisely, your job is to create:

The Set Interface

The Set  interface should contain these function declarations:

Set_T       Set_new(int iEstimatedLength, 
                    int (*pfCompare)(const void *pvKey1, const void *pvKey2));
void        Set_free(Set_T oSet);
void        Set_clear(Set_T oSet);
int         Set_getLength(Set_T oSet);
int         Set_put(Set_T oSet, const void *pvKey, void *pvValue);
int         Set_remove(Set_T oSet, const void *pvKey);
const void *Set_getKey(Set_T oSet, const void *pvKey);
void       *Set_getValue(Set_T oSet, const void *pvKey);
void        Set_map(Set_T oSet, 
                    void (*pfApply)(const void *pvKey, void **ppvValue, void *pvExtra),
                    void *pvExtra);

The Set_new function should return a new Set.  The parameter iEstimatedLength is an estimate of the maximum number of  bindings that the Set will contain.  The parameter pfCompare is a pointer to a compare function that the Set will use to compare keys.  The function *pfCompare should return a negative integer if there is a "less than" relationship between the two given keys, a positive integer if there is a "greater than" relationship between the two given keys, and zero if the two given keys are equal.  It should be a checked runtime error for iEstimatedLength to be less than 1, or for pfCompare to be NULL.

The Set_free function should free the dynamically allocated memory in which oSet is residing.  It should be a checked runtime error for oSet to be NULL.

The Set_clear function should remove all bindings from oSet.  It should be a checked runtime error for oSet to be NULL.

The Set_getLength function should return the number of bindings in oSet. It should be a checked runtime error for oSet to be NULL.

The Set_put function should add a new binding to oSet consisting of key *pvKey and value *pvValue.  It should reject the new binding if a binding whose key equals *pvKey already exists in oSet.  It should return 1 (TRUE) if successful, and 0 (FALSE) otherwise. It should be a checked runtime error for oSet or pvKey to be NULL.

The Set_remove function should remove from oSet the binding whose key equals *pvKey.  It should return 1 (TRUE) if successful, and 0 (FALSE) otherwise (that is, if no such binding exists).  It should be a checked runtime error for oSet or pvKey to be NULL.

The Set_getKey function should return the address of the key of the binding within oSet whose key equals *pvKey, or NULL if no such binding exists.  It should be a checked runtime error for oSet or pvKey to be NULL.

The Set_getValue function should return the address of the value of the binding within oSet whose key equals *pvKey, or NULL if no such binding exists. It should be a checked runtime error for oSet or pvKey to be NULL.

The Set_map function should apply function *pfApply to each binding in oSet.  That is, for each binding in oSet consisting of key pvKey and value pvValue, the function should call (*pfApply)(pvKey, &pvValue, pvExtra).  The function (*pfApply) should be applied to the bindings in increasing order as determined by the Set's compare function.  It should be a checked runtime error for oSet or pfApply to be NULL.

A checked runtime error should abort program execution via an assert function call.

The SetIter Interface

A SetIter should be an "iterator"; it should allow the user to iterate over the bindings of a specified Set, while hiding the implementation of the Set from the user.  At any given time, a SetIter is either in an invalid state, or it has selected a particular binding of its Set.  A SetIter should iterate over its Set's bindings in increasing order as specified by the Set's compare function.  A SetIter may assume that no bindings will be added to or removed from its Set during the "lifetime" of the SetIter.

The SetIter  interface should contain these function declarations:

SetIter_T   SetIter_new(Set_T oSet); 
void        SetIter_free(SetIter_T oSetIter); 
void        SetIter_selectFirst(SetIter_T oSetIter);
void        SetIter_selectNext(SetIter_T oSetIter);
int         SetIter_selectionIsValid(SetIter_T oSetIter);
const void *SetIter_getSelectedKey(SetIter_T oSetIter);
void       *SetIter_getSelectedValue(SetIter_T oSetIter);

The SetIter_new function should return a new SetIter that can be used to iterate over oSet's bindings.  After the function executes, the SetIter should be in an invalid state.  It should be a checked runtime error for oSet to be NULL.

The SetIter_free function should free the dynamically allocated memory in which oSetIter is residing.  It should be a checked runtime error for oSetIter to be NULL.

The SetIter_selectionIsValid function should return 1 (TRUE) if oSetIter is in a valid state, and 0 (FALSE) otherwise.  It should be a checked runtime error for oSetIter to be NULL.

The SetIter_selectFirst function should select the first binding of oSetIter's Set.  If there is indeed a first binding, then the function should place oSetIter into a valid state; if the Set is empty, the function should leave oSetIter in an invalid state.  It should be a checked runtime error for oSetIter to be NULL.

The SetIter_selectNext function should select the next binding of oSetIter's Set.  If there is no next binding, then the function should place oSetIter into an invalid state.  It should be a checked runtime error for oSetIter to be NULL, or for oSetIter to be in an invalid state when the function is called.

The SetIter_getSelectedKey function should return the address of the key of oSetIter's selected binding.  It should be a checked runtime error for oSetIter to be NULL, or for oSetIter to be in an invalid state.

The SetIter_getSelectedValue function should return the address of the value of oSetIter's selected binding.  It should be a checked runtime error for oSetIter to be NULL, or for oSetIter to be in an invalid state.

As with the Set interface, a checked runtime error should abort program execution via an assert function call.

The Dynamic Array Implementation

In the dynamic array implementation, the Set_new function should dynamically allocate arrays that contain exactly iEstimatedLength keys and values.  Then, when necessary, the Set_put function should expand those arrays; it should do so by doubling their size.  (The Set_remove function need not shrink the arrays.)  You will find the realloc function useful for expanding the arrays.

The array of keys should be kept sorted at all times.  The Set_getKey, Set_getValue, and Set_remove functions should use a binary search to find the specified key.  Similarly, the Set_put function should use a binary search to make sure that the given key does not already exist in the Set.

The Binary Search Tree Implementation

In the binary search tree implementation, the Set_new function will not use the iEstimatedLength parameter.  For simplicity, the Set_put function need not keep the binary search tree balanced.

Hint:  You may find it necessary to implement your binary search tree such that each node not only contains pointers to its left and right children, but also contains a pointer to its parent.  In particular, the SetIter_selectNext function may rely on parent pointers.

Logistics

You should develop on arizona. Use Emacs to create source code. Use the "make" tool to automate the build process. Use gdb to debug.

The file /u/cs217/Assignment2/testset.c contains a minimal test program that illustrates typical usage of the Set and SetIter ADTs.  You may use that test program as the basis for defining your own tests.

You should create five text files for submission: set.h, setarray.c, settree.c, makefile, and readme. The set.h file should contain the interfaces of your Set and SetIter ADTs.  The setarray.c file should contain the array implementation of the two ADTs.  The settree.c file should contain the tree implementation of the two ADTs.  

The makefile file should contain commands to the "make" program indicating how it should build your program.  In particular, when the user types "make" (with no command-line arguments), your makefile should instruct the make program to build four executable files:

As always, the readme file should contain your name, and any information that will help us to grade your work in the most favorable light.  In particular you should describe all known bugs in your readme file.

Submit your work electronically via the command /u/cs217/bin/submit 2 set.h setarray.c settree.c makefile readme.

Grading

We will grade your work on correctness.  Understandability will also be a major component of your grade; please see the lecture notes and the statement of Assignment 1 for some guidelines concerning program understandability.  To encourage good coding practices, you will lose points for any warning messages generated by gcc during the compilation of your work.