H. Ando et al., SPECULATIVE EXECUTION AND REDUCING BRANCH PENALTY ON A SUPERSCALAR PROCESSOR, IEICE Transactions on Electronics, E76C(7), 1993, pp. 1080-1093
Superscalar processors improve performance by exploiting
instruction-level parallelism (ILP). In non-numerical applications,
however, the ILP available within a basic block is not sufficient for
substantial speedup. To improve performance dramatically, instructions
across branches must be executed in parallel; that is, speculative
execution is strongly required. Boosting is a general solution for
achieving speculative execution: an instruction is labeled for
speculative execution, and the hardware handles its side effects. This
paper describes an efficient implementation of boosting in terms of
cost/performance trade-offs. Our implementation policy is beneficial
with respect to code scheduling heuristics, the penalties imposed by
the code duplication needed to maintain program semantics, and area
cost. This paper also describes a branch scheme that minimizes branch
penalty. Branch delay severely penalizes the performance of superscalar
processors because even a single delay cycle contains multiple delay
slots. Our scheme fetches both the sequential and the target
instructions and selects one of them on a branch, so that no delay
cycle is imposed. This scheme is realized by a combination of static
code movement and hardware support. As a result, we reduce branch
penalty at small cost. Simulation results show that our ideas are
highly effective in improving the performance of a superscalar
processor.
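
The boosting idea summarized in the abstract (an instruction is marked speculative, and its side effects are held back until the branch outcome is known) can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the implementation the paper describes: the abstract does not say how the hardware handles side effects, so the shadow buffer, the class name BoostingModel, and the helper run are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BoostingModel:
    """Toy model (assumption): boosted results wait in a shadow buffer."""
    regs: dict = field(default_factory=dict)    # committed register state
    shadow: dict = field(default_factory=dict)  # buffered speculative results

    def execute(self, dest, value, boosted=False):
        # A boosted instruction must not change committed state yet.
        if boosted:
            self.shadow[dest] = value
        else:
            self.regs[dest] = value

    def resolve_branch(self, predicted_correctly):
        # Correct prediction: commit the buffered results.
        # Wrong prediction: discard them, leaving committed state untouched.
        if predicted_correctly:
            self.regs.update(self.shadow)
        self.shadow.clear()


def run(predicted_correctly):
    """Hypothetical driver: one normal write, one boosted write, one branch."""
    model = BoostingModel()
    model.execute("r1", 10)                # ordinary instruction
    model.execute("r2", 99, boosted=True)  # moved above the branch
    model.resolve_branch(predicted_correctly)
    return model.regs


if __name__ == "__main__":
    print(run(True))
    print(run(False))
```

In this sketch the boosted write to r2 becomes visible only when the branch resolves as predicted; otherwise it is dropped and r1 alone remains in the committed state.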