A trace cache microarchitecture and evaluation

Citation
E. Rotenberg et al., A trace cache microarchitecture and evaluation, IEEE COMPUT, 48(2), 1999, pp. 111-120
Number of citations
29
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON COMPUTERS
ISSN journal
0018-9340
Volume
48
Issue
2
Year of publication
1999
Pages
111 - 120
Database
ISI
SICI code
0018-9340(199902)48:2<111:ATCMAE>2.0.ZU;2-7
Abstract
As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. In this paper, we present and evaluate a microarchitecture incorporating a trace cache. The microarchitecture provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of 1) control flow prediction and 2) instruction supply. For the SPEC95 integer benchmarks, trace-level sequencing improves performance from 15 percent to 35 percent over an otherwise equally sophisticated, but contiguous, multiple-block fetch mechanism. Most of this performance improvement is due to the trace cache. However, for one benchmark whose performance is limited by branch mispredictions, the performance gain is almost entirely due to improved prediction accuracy.
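
The idea of caching traces of the dynamic instruction stream can be illustrated with a toy model. The Python sketch below is not the microarchitecture evaluated in the paper: the names `build_traces` and `fetch`, the limits `max_len` and `max_branches`, and the representation of the stream as `(pc, is_branch, taken)` tuples are all assumptions made for illustration. It only shows how instructions at noncontiguous addresses can be stored together and supplied in one lookup when the predicted branch outcomes match those embedded in the trace.

```python
def build_traces(stream, max_len=16, max_branches=3):
    """Split a dynamic instruction stream into traces (hypothetical model).

    stream: iterable of (pc, is_branch, taken) tuples in execution order.
    Returns {start_pc: {branch_outcomes: [pc, ...]}}.
    """
    cache = {}
    pcs, outcomes = [], []
    for pc, is_branch, taken in stream:
        pcs.append(pc)
        if is_branch:
            outcomes.append(taken)
        # Assumed termination rule: end a trace when it is full or has
        # accumulated the maximum number of branches.
        if len(pcs) == max_len or len(outcomes) == max_branches:
            cache.setdefault(pcs[0], {})[tuple(outcomes)] = pcs
            pcs, outcomes = [], []
    if pcs:  # keep the partial trace left at the end of the stream
        cache.setdefault(pcs[0], {})[tuple(outcomes)] = pcs
    return cache


def fetch(cache, fetch_pc, predictions):
    """Return a cached trace starting at fetch_pc whose embedded branch
    outcomes agree with the predicted outcomes, or None on a miss."""
    for outcomes, pcs in cache.get(fetch_pc, {}).items():
        if tuple(predictions[:len(outcomes)]) == outcomes:
            return pcs
    return None


# A taken branch at pc 2 jumps to pc 10, so the instructions are not
# contiguous in memory, yet they end up contiguous inside one trace.
stream = [
    (0, False, False), (1, False, False), (2, True, True),
    (10, False, False), (11, True, False), (12, False, False),
]
cache = build_traces(stream, max_len=16, max_branches=2)
hit = fetch(cache, 0, [True, False])    # predictions match the stored trace
miss = fetch(cache, 0, [False, False])  # first prediction disagrees
```

In this model a hit supplies both basic blocks (pcs 0 to 2 and 10 to 11) in a single lookup, whereas a conventional instruction cache would need to access two separate locations. On a miss, a real design would fall back to a conventional fetch path, which is omitted here.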