A comparison of C, ML, Java on a realistic benchmark

Andrew W. Appel, September 25, 1998

It's difficult to benchmark the execution speeds of different programming languages, since it's rare to find "the same program" written in different languages. In most cases such comparisons are based on tiny programs written specifically as benchmark kernels. A better interlanguage benchmark should use a variety of algorithms and data structures to solve a real problem.

My compiler textbooks [Appel98c,Appel98j,Appel98m] outline the structure of a "Tiger" compiler, a real compiler (for a simple language), implemented in C, ML, and Java. For pedagogical purposes, I made every effort to make all three implementations follow the same structure and use similar data representations and algorithms. However, I also tried to make the three compilers use idiomatic style (where reasonable) in each implementation language.

Some natural questions arise: Are the three languages all suited to writing compilers? How fast do the programs run? How expressive are the languages, that is, how many lines of code did it take in each language?

Language Lines of code Run time Compiler

C 5440 4.88 sec gcc 2.7.2.3 -O

ML 2666 9.48 sec sml/nj 110.0.3

Java 4482 21.98 sec jdk 1.1.6 -O

Language	Lines of code	Run time	Compiler
C	5440	4.88 sec	gcc 2.7.2.3 -O
ML	2666	9.48 sec	sml/nj 110.0.3
Java	4482	21.98 sec	jdk 1.1.6 -O

All timings reported in this report are run on a 200 MHz 512-Megabyte dual-processor UltraSparc-2, Solaris 5.5.1; the test input is a 1393-line Tiger program (41 copies of queens.tig concatenated).

The benchmark program is a small but high-tech compiler for Tiger, a simple Algol-like programming language with Java-like data structures. The principal phases are lexical analysis (using an automatically generated lexer); parsing (using an automatically generated LALR(1) parser); abstract-syntax-tree construction; semantic analysis and translation to intermediate-representation trees; rewriting the IR trees to pull statements out of expressions; instruction selection; liveness analysis by global static dataflow analysis; construction of register-interference graph; graph-coloring register allocation; emission of assembly code.

Line counts include input specifications to parser and lexer generators but not generated output, which is 2387, 4764, 2684 lines respectively for C, ML, Java. Parsing and lexical analysis times are influenced not only by the quality of programming language and compiler, but also by the quality of the code emitted by the parser generators. Lexing/parsing time was 0.08, 0.47, 2.85 seconds for C, ML, Java.

In writing the programs I found that it is indeed possible to express this program in each of the three languages without undue difficulty. However, representation of data structures (especially abstract-syntax trees and IR trees) is much more tedious in C and Java. This problem could in principle be solved using an appropriate program-generation tool in C or Java, such as the Zephyr ASDL [Wang97], which might reduce the line-counts of C and Java by 690 and 450 lines respectively.

Even so, there are other ways in which ML is well suited to this application domain, and the ML's expressiveness (as shown by the Lines of Code column) is a significant advantage.

Graph representation: The Tiger compiler as described in my textbooks use an abstract view of graphs (for flowgraphs and interference graphs). Preliminary profiling showed that this was a bottleneck for all three implementations. I didn't want to compare three inefficient programs against each other, so I modified the FlowGraph, Liveness, and Color modules (in each compiler) to use a more concrete representation for directed graphs [Appel98x]. Run times without this modification were 19, 65, 194 seconds for C, ML, Java, respectively.

The C program uses malloc() but not free(). With the default implementation of malloc/free it runs in 5.65 seconds. With the Boehm garbage collector [Boehm96] attached, it runs in 4.88 seconds.

"JDK 1.1.6 for Solaris features ... improved scalability and performance due to the inclusion of a highly optimizing JIT compiler" [Sun98]. Execution of the JIT (converting bytecode to native code at the beginning of execution) took 0.37 seconds, included in all Java times given above.

Conclusion: In a realistic program that has not been extensively tuned (but which uses reasonable algorithms in general), C runs twice as fast as ML, but the C program has twice as many lines of code. Java's line-count is close to C, but Java's performance is much worse than ML's.

Bibliography

[Appel98c]: Andrew W. Appel, Modern Compiler Implementation in C. Cambridge, 1998.
[Appel98j]: Andrew W. Appel, Modern Compiler Implementation in Java. Cambridge, 1998.
[Appel98m]: Andrew W. Appel, Modern Compiler Implementation in ML. Cambridge, 1998.
[Appel98x]: Andrew W. Appel, An alternate graph representation for the Tiger compiler, May 1998.
[Boehm96]: Hans-J. Boehm, A garbage collector for C and C++, 1996.
[Sun98]: Java Development Kit for Solaris, September 23, 1998.
[Wang97]: Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Christopher S. Serra. The Zephyr Abstract Syntax Description Language. Conference on Domain-Specific Languages, USENIX Association, October 1997.