Grant Wallace
CS 597d
Observation 10:
Effective Compiler Support for Predicated Execution Using the Hyperblock
[1]
Strengths:
Allows for selective combination of basic blocks
into a single if-converted (predicated) block structure (Hyperblock). This
can help improve performance by limiting the number of paths of execution
which must be issued, and allow for better compiler optimizations (compared
to combining all basic blocks into one predicated structure). Basic blocks
for inclusion are given priority based on smaller size, greater execution
frequency, and fewer hazardous instructions (procedure calls, unresolved
memory accesses).
Allows for speculative code motion using techniques
like instruction promotion, instruction merging.
Reduces branch mispredictions because there are
fewer branches (if-converted).
Makes more complete use of a processors issue width
(as compared to superblocks), and so on larger issue processors it outperforms
superblocks.
Weaknesses:
Will perform worse than superblock formations on
lower issue width processors because many of the instructions which must
be issued are predicated false (essentially no-ops).
It’s interesting that hyperblocks with all execution
paths (IP) and hyperblocks with selected execution paths (PP) often don’t
show significant performance variation on 2-issue processors. Sometimes
it’s not until 8-issue that a marked difference is seen. I would have expected
IP to perform worse on 2-issue processors as compared to PP, and then to
slowly catch up as the processor width is increased.
The node-splitting algorithm concentrates on finding
merge points where one path has a large flow and small size, and the other
path has a small flow and large size. This is the worst case because the
large flow path must wait for the instruction issue of the large size path.
So these merge points are targeted for node splitting.
For basic block inclusion, the algorithm looks for
bbs which are small, executed frequently, and have few hazardous instructions.
Large bbs have more opportunity for optimizations even without inclusion.
Hazardous instructions decrease the overall code motion and optimizations
possible in that bb.
Question:
Section 3.5 mentions compiler optimizations across
multiple paths. What types of optimizations are these? Would instruction
merging fall into this category?
[1] Scott A. Mahlke, ... "Effective Compiler Support for Predicated
Execution Using the Hyperblock", Proceedings of the 25th International
Symposium on Microarchitecture, December, 1992.