Grant Wallace
CS 597d
 

Observation 6:

Sentinel Scheduling for VLIW and Superscalar Processors [1]

 
The difference between S and R (sentinel and restricted scheduling) in figure 4 is probably due primarily to speculative scheduling of memory loads in S. The integer benchmarks probably don’t have many floating-point or divide instructions, and no stores are speculatively issued in S. Since R can already speculatively schedule non-excepting instructions, loads are the main potentially excepting instructions left for S to exploit.
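The elimination argument above can be written out as a small sketch. The opcode names and the set of potentially excepting instructions are my own illustrative assumptions, not the classification used in [1]; the only rules encoded are the ones stated above (R speculates only non-excepting instructions, and S does not speculate stores).

```python
# Hypothetical opcode classes; the names are illustrative, not taken from [1].
POTENTIALLY_EXCEPTING = {"load", "store", "int_div", "fp_add", "fp_mul", "fp_div"}


def can_speculate(opcode, model):
    """Return True if `opcode` may be moved above a branch under `model`.

    "R": restricted scheduling, only non-excepting instructions are speculated.
    "S": sentinel scheduling, excepting instructions too, except stores.
    """
    if model not in ("R", "S"):
        raise ValueError(f"unknown model: {model!r}")
    if opcode not in POTENTIALLY_EXCEPTING:
        return True  # both models can speculate non-excepting instructions
    if model == "R":
        return False
    return opcode != "store"


def gained_by_sentinel(opcodes):
    """Opcodes that S can speculate but R cannot, without duplicates."""
    return sorted(
        {op for op in opcodes if can_speculate(op, "S") and not can_speculate(op, "R")}
    )


# An assumed integer-style instruction mix with no float or divide operations.
integer_mix = ["add", "load", "cmp", "store", "load", "shift"]
print(gained_by_sentinel(integer_mix))  # ['load']
```

For a mix like this one, loads are the only instructions whose treatment differs between the two models, which is the point made above.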
 
The 2-issue S and R models don’t always have an appreciable difference (fig. 4). This suggests that the R model is able to keep the pipeline fairly full even without speculative loads.
 
Based on figure 5, it seems that speculative stores aren’t worth the extra hardware or complexity.
 
I’d be interested in seeing a comparison between S and various levels of the instruction boosting model. [1] states that general percolation achieves nearly the same performance as instruction boosting, but the authors don’t mention how many levels of branches instruction boosting speculated across. It would be interesting to see at what value of N (the number of branch levels used for boosting) the speedup of instruction boosting begins to approach an asymptote. That asymptote would be roughly the limiting speedup due to speculation (with current branch prediction methods).
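One way to make "begins to reach an asymptote" concrete, given a measured speedup for each N, is to look for the first N after which every additional level improves speedup by less than some small relative tolerance. The sketch below assumes that definition; the function name, the 1% tolerance, and the sample numbers are all hypothetical placeholders, not results from [1].

```python
def saturation_level(speedups, tolerance=0.01):
    """Return the smallest N after which each extra branch level adds less
    than `tolerance` relative speedup, or None if the curve has not
    flattened within the measured range.

    speedups[i] is the speedup measured with N = i + 1 branch levels.
    """
    if not speedups:
        raise ValueError("need at least one measurement")
    for i in range(len(speedups) - 1):
        # Relative gain of every step from level i + 1 onward.
        later_gains = [
            (speedups[j + 1] - speedups[j]) / speedups[j]
            for j in range(i, len(speedups) - 1)
        ]
        if all(gain < tolerance for gain in later_gains):
            return i + 1
    return None


# Placeholder numbers for illustration only; they are not measurements.
example = [1.0, 1.5, 1.8, 1.805, 1.806]
print(saturation_level(example))  # 3
```

A return value of None means more levels would have to be measured before anything can be said about the limit.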
 
[1] Mahlke, Chen, Hwu, Rau, Schlansker, "Sentinel Scheduling for VLIW and Superscalar Processors", Proceedings of the Fifth Int'l Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-V), Boston, MA, Oct. 12-15, 1992, pp. 238-247.