HUFFMAN CODING ---------------- Compilation Instructions: ------------------------------------------------------------------------------- Both huff.c and dhuff.c #include tree.h. huff.c and dhuff.c are both independent programs and each should be compiled together with tree.c, making sure that the math library (math.h) is properly included by the compiler. The source has compiled under Sun/Solaris using gcc and lcc and as a SIOW application under MacOS (the code needs to be modified due to the inability to pass command-line arguments to a MacOS app.) using the latest MPW-GM release from Apple. For gcc: gcc -lm huff.c tree.c gcc -lm dhuff.c tree.c -------------------------------------------------------------------------------- huff.c has been tested on both ASCII text and binary files and appears to work. It works by constructing Huffman trees, which is actually NOT the most efficient way to perform Huffman coding, but it is the obvious way to go about it. The code has acceptable running times on relatively large files - i.e., the Project Gutenberg edition of the King James Bible, which is about 4.1 MB in size - and decoding is about 5 times faster than encoding. Decoding is accomplished by reconstructing the original Huffman tree from the original frequency data, which is embedded at the top of each compressed file. This information is stored in 5 byte chunks - one for each character appearing in the source file. I picked this size because it easily stores all the frequency data for the Bible, which is a rather large text file. If you have a file with frequencies larger than 4 bytes, you're in trouble. This could be fixed by scaling frequency counts to fit into one byte - just map the highest frequency to the number 255 and scale accordingly.