Published on *Computer Science Department at Princeton University* (http://www.cs.princeton.edu)

The microprocessor industry has embraced multicore architectures as the new

dominant design paradigm. Harnessing the full power of such computers requires

writing multithreaded programs, but regardless of whether one is writing a program

from scratch or porting an existing single-threaded program, concurrency is hard to

implement correctly and often reduces the legibility and maintainability of the source

code. Single-threaded programs are easier to write, understand, and verify.

Parallelizing compilers oer one solution by automatically transforming sequential

programs into parallel programs. Assisting the programmer with challenging tasks

like this (or other optimizations), however, causes compilers to be highly complex.

This leads to bugs that add unexpected behaviors to compiled programs in ways that

are very difficult to test. Formal compiler verification adds a rigorous mathematical

proof of correctness to a compiler, which provides high assurance that successfully

compiled programs preserve the behaviors of the source program such that bugs are

not introduced. However, no parallelizing compiler has been formally verified.

We lay the groundwork for verified parallelizing compilers by developing a general

theory to prove the soundness of parallelizing transformations. Using this theory, we

prove the soundness of a framework of small, generic transformations that compose

together to build optimizations that are correct by construction. We demonstrate

it by implementing several classic and cutting-edge loop-parallelizing optimizations:

DOALL, DOACROSS, and Decoupled Software Pipelining. Two of our main contributions

are the development and proof of a general parallelizing transformation and a

transformation that coinductively folds a transformation over a potentially nonterminating

loop, which we compose together to parallelize loops. Our third contribution

is an exploration of the theory behind the correctness of parallelization, where we

consider the preservation of nondeterminism and develop bisimulation-based proof

techniques. Our proofs have been mechanically checked by the Coq Proof Assistant.