Grant Wallace
CS 597d
Observation 6:
Sentinel Scheduling for VLIW and Superscalar Processors [1]
The difference between S and R (sentinel and restricted scheduling)
in figure 4 is probably due mainly to the speculative scheduling of memory
load instructions in S. The integer benchmarks probably don’t have many
float or divide instructions, and no stores are speculatively issued in S.
That leaves loads as the main source of the gain, since R can already
speculatively schedule non-excepting instructions.
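The elimination argument above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the scheduler from [1]: the instruction classes, the `EXCEPTING` set, and the function names are my own assumptions, and S is taken to be the model in which stores are not speculated.

```python
from enum import Enum


class Op(Enum):
    INT_ALU = "int_alu"
    LOAD = "load"
    STORE = "store"
    INT_DIV = "int_div"
    FLOAT = "float"


# Assumption: these classes can raise an exception when executed.
EXCEPTING = {Op.LOAD, Op.STORE, Op.INT_DIV, Op.FLOAT}


def can_speculate(op, model):
    """Return True if `op` may be moved above a branch under `model`."""
    if model == "R":
        # Restricted scheduling: only non-excepting instructions move.
        return op not in EXCEPTING
    if model == "S":
        # Sentinel scheduling (without speculative stores).
        return op is not Op.STORE
    raise ValueError(f"unknown model: {model}")


def gained_by_sentinel():
    """Instruction classes that S can speculate but R cannot."""
    return [op for op in Op
            if can_speculate(op, "S") and not can_speculate(op, "R")]
```

Under these assumptions `gained_by_sentinel()` returns loads, integer divides, and float instructions. If divides and floats are rare in the integer benchmarks, loads are what remain.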
The 2-issue S and R models don’t always differ appreciably
(figure 4). This suggests that, at that issue width, the R model is able to
keep the pipeline fairly full even without speculative loads.
Based on figure 5, it seems that speculative stores aren’t worth the
extra hardware or complexity.
I’d be interested in seeing a comparison between S and various levels
of the instruction boosting model. [1] states that general percolation
achieves nearly the same performance as instruction boosting, but it
doesn’t mention how many levels of branches the boosting speculated
across. It would be interesting to see at what value of N (the number
of branch levels used for boosting) the speedup begins to approach an
asymptote. That value would be roughly the limiting speedup due
to speculation (with current branch prediction methods).
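One way to read the asymptote off such an experiment is sketched below. It is a minimal sketch under my own assumptions: the function name `plateau_level`, the 2% tolerance, and the idea of a list of measured speedups indexed by N are all hypothetical, and no such measurements appear in [1].

```python
def plateau_level(speedups, tol=0.02):
    """Find the boosting level N at which speedup stops improving.

    speedups[i] is the measured speedup when boosting across i + 1
    branch levels (so speedups[0] is N = 1). Values must be positive.
    Returns the smallest N for which moving to N + 1 improves speedup
    by less than `tol` (relative), or None if no plateau is seen.
    """
    for i in range(len(speedups) - 1):
        relative_gain = (speedups[i + 1] - speedups[i]) / speedups[i]
        if relative_gain < tol:
            return i + 1  # convert 0-based index to 1-based level N
    return None
```

The speedup at the returned N would then serve as an estimate of the limit, with the tolerance deciding how flat the curve has to be before it counts as an asymptote.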
[1] Mahlke, Chen, Hwu, Rau, Schlansker, "Sentinel Scheduling for VLIW
and Superscalar Processors", Proceedings of the Fifth Int'l Conference
on Architectural Support for Programming Languages and Operating Systems
(ASPLOS-V), Boston, MA, Oct. 12-15, 1992, pp. 238-247