Emulating Write-Allocate on a No-Write-Allocate Cache
A cache write-miss policy of write-allocate leads to better
performance for many programs than a policy of no-write-allocate.
I show a compilation technique that emulates a write-allocate policy even
on a no-write-allocate machine, provided that the machine has non-blocking
cache read misses (e.g., the DEC Alpha 21064).
The trick is to prefetch garbage: a load from the target location is
issued before the write and its result is discarded, so that
the cache line will be allocated and the write will hit the cache.
Measurements show a factor-of-1.2 improvement in execution time.
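
As an illustration only, the prefetch trick described above might be sketched in C as follows. This is not the compiler's actual output: the function name, the 32-byte line size, and the use of a volatile read to stand in for a load into a dead register are all assumptions made for the sketch. It also assumes the buffer starts on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size in bytes (hypothetical; tune per machine). */
#define LINE_BYTES 32
#define LINE_WORDS (LINE_BYTES / sizeof(long))

/* Hypothetical helper: fill dst[0..n) with value, touching each
   upcoming cache line with a load before it is written. */
static void fill_with_garbage_prefetch(long *dst, size_t n, long value)
{
    assert(dst != NULL);
    for (size_t i = 0; i < n; i++) {
        /* At the start of each line, read one word from the next line.
           The loaded value is garbage and is discarded; the load exists
           only to get the line allocated before the stores reach it.
           The bounds check keeps the read inside the buffer. */
        if (i % LINE_WORDS == 0 && i + LINE_WORDS < n) {
            (void)*(volatile long *)&dst[i + LINE_WORDS];
        }
        dst[i] = value;
    }
}
```

On a machine with non-blocking read misses, the discarded load would not stall the stores that follow it, which is the property the technique relies on. Whether a sketch like this helps on any given machine would have to be measured.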