Through simulations, we show the effect of several microarchitectural
parameters on the performance of a dynamic out-of-order microprocessor.
Next, we show that memory instructions, especially stores, considerably
limit the available instruction-level parallelism (ILP). Three techniques
are proposed to mitigate the effect of memory instructions: a static, a
mixed static/dynamic, and a fully dynamic technique. We focus on the
fully dynamic technique, which enables the out-of-order execution of loads
and stores. If a memory dependence fault is detected, the traditional
branch misprediction recovery hardware is used for recovery. Since this
scheme does not perform very well, a dependence-fault predicting cache is
introduced. (C) 1999 Elsevier Science B.V. All rights reserved.
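
The idea of a dependence-fault predicting cache can be illustrated with a
minimal sketch. The abstract does not describe how the cache is organized,
so everything below is an assumption made for illustration: the
direct-mapped table indexed by load address (PC), the class name
DependenceFaultCache, the helper run_load, and the simplification that a
fault occurs whenever a load's address matches an older store that has not
yet completed. A load that faulted once is predicted to fault again and is
held back instead of being issued speculatively.

```python
class DependenceFaultCache:
    """Hypothetical direct-mapped table of load PCs that faulted before."""

    def __init__(self, entries=64):
        self.entries = entries
        self.tags = [None] * entries  # one stored load PC per entry

    def _index(self, pc):
        # Assumed indexing scheme: low-order bits of the load PC.
        return pc % self.entries

    def predicts_fault(self, pc):
        # Hit only if the entry holds exactly this load PC.
        return self.tags[self._index(pc)] == pc

    def record_fault(self, pc):
        # Remember the load; a conflicting entry is simply overwritten.
        self.tags[self._index(pc)] = pc


def run_load(cache, load_pc, load_addr, pending_store_addrs):
    """Classify one load as 'waited', 'fault', or 'speculated'.

    pending_store_addrs is the set of addresses written by older stores
    that have not completed yet (a simplification of real hardware).
    """
    if cache.predicts_fault(load_pc):
        return "waited"       # predicted fault: do not bypass the stores
    if load_addr in pending_store_addrs:
        cache.record_fault(load_pc)
        return "fault"        # speculation failed: recovery is needed
    return "speculated"       # load safely executed ahead of the stores


cache = DependenceFaultCache(entries=8)
# The same load repeatedly conflicts with an older store to 0x1000.
outcomes = [run_load(cache, 0x40, 0x1000, {0x1000}) for _ in range(3)]
# An unrelated load with no conflicting store.
other = run_load(cache, 0x44, 0x2000, {0x1000})
```

In this sketch only the first execution of the conflicting load pays the
recovery cost; later executions wait for the store instead. A real design
would also need a policy for removing entries whose loads no longer
conflict, which is not modeled here.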