
Performance Scalability of Decoupled Software Pipelining
Ram Rangan, Neil Vachharajani, Guilherme Ottoni, and David I. August
ACM Transactions on Architecture and Code Optimization (TACO), Volume 5, Number 2, August 2008.
Any successful solution to using multi-core processors to scale
general-purpose program performance will have to contend with rising
inter-core communication costs while exposing coarse-grained
parallelism. Recently proposed pipelined multithreading (PMT)
techniques have been demonstrated to have general-purpose
applicability and are also able to effectively tolerate inter-core
latencies through pipelined inter-thread communication. These
desirable properties make PMT techniques strong candidates for program
parallelization on current and future multi-core processors, and
understanding their performance characteristics is critical to their
deployment. To that end, this paper evaluates the performance scalability of a
general-purpose PMT technique called decoupled software pipelining
(DSWP) and presents a thorough analysis of the communication
bottlenecks that must be overcome for optimal DSWP scalability.