Princeton University
Computer Science Dept.

Computer Science 597d
Advanced Topics in Computer Science: Synergistic Hardware-Compiler Architecture Design

David August

Fall 1999


Optimization Projects


Grant Wallace

To: august@CS.Princeton.EDU
Subject: optimization project

I have completed a series of optimizations as follows...

gopti3 - implements FRP starting with your opti5 code. No performance improvement.

gopti5 - starting from the orig code, I tried to compact the branches together. I noticed that the limiting factor seemed to be there was a data dependency between iterations based on the gotspace register which determines if a space character was seen in the last iteration. I broke this dependency by creating a separate gotspace register for each iteration. This decreased the cycle count from 289353 (orig) to 210456 (gopti5).

gopti6 - In this optimization I combine FRP (gopti3) with gotspace register renaming (gopti5). This gave an additional performance increase bringing the best count down to 171003.

gopti7 - I noticed that the code wasn't well packed in gopti6 and I thought the limiting factor might be the data dependence formed from the word_count and line_count accumulators. So I duplicated the accumulators and did register renaming to give each iteration it's own accumulator. This didn't help much and actually slowed the loop down from 171003 to 171020.

gopti8 - Startin from gopti7, I thought the limiting factor might be the boolean optimization, since you only implemented that for iters 1-4. So I added the boolean optimizations to iters 5-8. This actually helped quite a bit as it reduced the dependence height of the predicate calculations. The cycle count was brought down to 144722.

I have attached a tar-zip file of the gopti directories. There is a brief README file in each directory showing the cycle count and schedule.

Thanks,
Grant.

Grant's Schedule (without giving it away to those who are not done)


  orig/wc.HS_im_p.input1.stat:(best_cycle_count 289353.000000)
gopti3/wc.HS_im_p.input1.stat:(best_cycle_count 236751.000000)
gopti5/wc.HS_im_p.input1.stat:(best_cycle_count 210456.000000)
gopti6/wc.HS_im_p.input1.stat:(best_cycle_count 171003.000000)
gopti7/wc.HS_im_p.input1.stat:(best_cycle_count 171020.000000)
gopti8/wc.HS_im_p.input1.stat:(best_cycle_count 144722.000000)


-----------------+---------------------------------------------------------------------------------------------------------------
 cb 35 13150.0   |        55
  -> 42  0.0     |        56
  -> 42  0.0     |        57
  -> 42  0.0     |        58
  -> 42  1.0     |        59
  -> 42  0.0     |        60
  -> 42  0.0     |        61
  -> 42  0.0     |        62
  -> 35  1.0     |        63
  -> 42  13148.0 |        64
                 |        65
                 |        66
                 |        67
                 |        68
                 |        69
                 |        70
                 |   0    81     72     75     73     74     71L     -      -     -     -     -     -     -     -     -      -
                 |   1    94     85     88     87     82     86     80     77     79    76    84L   -     -     -     -      -
                 |   2   107     90     98    101    100     95     93     99     89    78    92    97L   -     -     -     83J
                 |   3   120    103    102    111    114    113    106    108    112    91   105   110L   -     -     -     96J
                 |   4   132    116    104    123    126    125    124    119    118   115   122L   -     -     -     -    109J
                 |   5   144    128    127    135    138    137    131    136    134L  117   130    -     -     -     -    121J
                 |   6   147    150    149    156    148    140    143    139    129   142   146L   -     -     -     -    133J
                 |   7   152    159    162    161    155    151    158L   168    160   141   154    -     -     -     -    145J
                 |   8   163    153    164    166    167    170    171    172    173   174   175   176   177    -     -    157J
                 |   9   165    178      -      -      -      -      -      -     -     -     -     -     -     -     -    169J
                 |  10   179      -      -      -      -      -      -      -     -     -     -     -     -     -     -    180J
-----------------+---------------------------------------------------------------------------------------------------------------

I know there are better schedules out there.... Firm due date is January 19th.