In the paper "Accurate and practical profile-driven compilation using
the profile buffer", the authors propose a compiler-driven scheme for
indexing a profile buffer. Even so, the compiler may not have enough
buffer entries to allocate across a branch-intensive section of code.
The resulting aliasing can produce inaccurate profile results
that cause poor code generation in the compiler. As one alternative,
the ISA could simply include instructions that let the program read
and write the profile buffer entries directly, as special registers.
The compiler could then generate spill code for the profile entries
as needed. This also removes the need for OS traps to capture profile
information.
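To make the aliasing problem and the spill idea concrete, here is a
rough simulation. It is only a sketch under my own assumptions: branch
ids are small integers, the compiler maps branch b to entry
b % num_entries, and "regions" are consecutive groups of num_entries
branches. The function names are hypothetical and nothing here comes
from the paper.

```python
def profile_aliased(trace, num_entries):
    """Fixed-size buffer with no spilling: branches whose ids collide
    modulo num_entries share one counter, so their counts are merged."""
    buf = [0] * num_entries
    for branch in trace:
        buf[branch % num_entries] += 1
    return buf


def profile_with_spill(trace, num_entries):
    """Same buffer, but entries are saved to memory and reloaded when
    execution moves to another region (assumed: region = id // num_entries)."""
    saved = {}                  # region -> counters spilled to memory
    buf = [0] * num_entries     # the hardware profile buffer
    current = None
    for branch in trace:
        region, slot = divmod(branch, num_entries)
        if region != current:
            if current is not None:
                saved[current] = buf                    # spill old entries
            buf = saved.get(region, [0] * num_entries)  # reload or start fresh
            current = region
        buf[slot] += 1
    if current is not None:
        saved[current] = buf                            # final spill
    # Flatten the spilled regions back into per-branch counts.
    counts = {}
    for region, entries in saved.items():
        for slot, n in enumerate(entries):
            if n:
                counts[region * num_entries + slot] = n
    return counts
```

With six branches and four entries, the aliased version folds branches
4 and 5 onto the counters of branches 0 and 1, while the spilling
version recovers exact per-branch counts at the cost of the extra
save and reload on each region change.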
A second alternative is to augment the profile buffer with
instrumentation code. This will be slower than a pure profile buffer
scheme, but with the profile buffer in place the instrumentation can
be confined to the least-traversed branches, such as those of outer
loops.
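One way the compiler might divide the branches between the two
mechanisms is sketched below. The use of loop nesting depth as a
stand-in for how often a branch is traversed is my assumption, and
split_branches is a hypothetical helper, not something from the paper.

```python
def split_branches(loop_depth, num_entries):
    """Give the deepest-nested branches the hardware buffer entries and
    leave the rest (e.g. outer-loop branches) to software counters.

    loop_depth maps a branch name to its loop nesting depth, used here
    as an assumed static estimate of execution frequency.
    """
    # Deepest first; ties broken by name so the result is deterministic.
    ranked = sorted(loop_depth, key=lambda b: (-loop_depth[b], b))
    hardware = ranked[:num_entries]
    instrumented = ranked[num_entries:]
    return hardware, instrumented
```

For example, with two buffer entries and branches at depths 3, 3, 2
and 1, the two depth-3 branches get buffer entries and the depth-2 and
depth-1 branches get instrumentation code.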
Manish