An architectural framework for runtime optimization

Citation
M. C. Merten et al., An architectural framework for runtime optimization, IEEE Transactions on Computers, 50(6), 2001, pp. 567-589
Number of citations
34
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON COMPUTERS
Journal ISSN
0018-9340
Volume
50
Issue
6
Year of publication
2001
Pages
567 - 589
Database
ISI
SICI code
0018-9340(200106)50:6<567:AAFFRO>2.0.ZU;2-L
Abstract
Wide-issue processors continue to achieve higher performance by exploiting greater instruction-level parallelism. Dynamic techniques such as out-of-order execution and hardware speculation have proven effective at increasing instruction throughput. Runtime optimization promises to provide an even higher level of performance by adaptively applying aggressive code transformations on a larger scope. This paper presents a new hardware mechanism for generating and deploying runtime optimized code. The mechanism can be viewed as a filtering system that resides in the retirement stage of the processor pipeline, accepts an instruction execution stream as input, and produces instruction profiles and sets of linked, optimized traces as output. The code deployment mechanism uses an extension to the branch prediction mechanism to migrate execution into the new code without modifying the original code. These new components do not add delay to the execution of the program except during short bursts of reoptimization. This technique provides a strong platform for runtime optimization because the hot execution regions are extracted, optimized, and written to main memory for execution and because these regions persist across context switches. The current design of the framework supports a suite of optimizations, including partial function inlining (even into shared libraries), code straightening optimizations, loop unrolling, and peephole optimizations.