Accurate instruction fetch and branch prediction are increasingly important
in today's superscalar architectures. Fetch prediction is the process of
determining the next instruction to request from the memory subsystem. Branch
prediction is the process of predicting the likely outcome of branch
instructions. A branch target buffer (BTB) is often used to provide target
addresses for taken branches and to predict the destination of indirect jumps.
Using a BTB avoids the delay needed to recalculate the destination address
and reduces the misfetch penalty. However, an effective branch target buffer
can be large and may increase the cycle time of a processor. We propose that
a design used in older computers, such as the PDP-8, be used in modern
architectures instead of a BTB design. The compiler would precompute the
branch destination for most branch instructions, allowing the branch
information to be stored with the instruction. We consider computing branch
destinations at link time and as instructions are fetched into the
instruction cache; both alternatives offer similar performance with different
advantages. A very small BTB is still useful to predict indirect branches,
which cannot be precomputed. Our results show that the Precomputed-Branch
architecture performs better than an architecture using only a BTB, and has
significant hardware savings. This is particularly true for larger programs
more representative of modern applications. (C) 1999 Elsevier Science B.V.
All rights reserved.
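
The idea of storing a precomputed destination with the instruction can be
illustrated with a minimal sketch. Everything concrete here is an assumption
made for illustration and is not taken from the paper: the 32-bit encoding,
the 6-bit opcode, the 26-bit field, the opcode value, and the function names
are all hypothetical. The sketch rewrites a PC-relative displacement into the
low bits of the absolute target, in the spirit of PDP-8 style in-page
addressing, so that the fetch stage can form the target by concatenation
rather than addition.

```python
# Hypothetical 32-bit encoding (an assumption, not taken from the paper):
# bits 31..26 hold the opcode, bits 25..0 hold a signed word displacement.
FIELD_BITS = 26
FIELD_MASK = (1 << FIELD_BITS) - 1
BRANCH_OPCODE = 0x04  # hypothetical opcode for a PC-relative direct branch


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def precompute(instr, pc):
    """Replace the displacement with the low bits of the absolute target.

    `pc` is the byte address of the instruction; targets are word aligned.
    This could run at link time or as a line is filled into the instruction
    cache. Non-branch instructions are returned unchanged.
    """
    if instr >> FIELD_BITS != BRANCH_OPCODE:
        return instr
    disp = sign_extend(instr & FIELD_MASK, FIELD_BITS)
    target_word = (pc >> 2) + 1 + disp  # relative to the next instruction
    return (BRANCH_OPCODE << FIELD_BITS) | (target_word & FIELD_MASK)


def fetch_target(instr, pc):
    """Recover the byte target of a precomputed branch without an adder."""
    high = (pc >> 2) & ~FIELD_MASK  # upper word-address bits come from the PC
    return (high | (instr & FIELD_MASK)) << 2


# A branch at byte address 0x1000 with displacement +3 words.
original = (BRANCH_OPCODE << FIELD_BITS) | 3
stored = precompute(original, 0x1000)
print(hex(fetch_target(stored, 0x1000)))
```

This sketch only handles targets that lie in the same aligned region as the
branch (the upper address bits are taken from the PC); a real design would
need some fallback for branches that cross that boundary, and for indirect
branches, whose targets are not known ahead of time.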