The expanding gap between microprocessor and DRAM performance has necessitated the use of increasingly aggressive techniques designed to reduce or hide the latency of main memory accesses. Although large cache hierarchies have proven effective in reducing this latency for the most frequently used data, many programs still spend more than half their run time stalled on memory requests. Data prefetching has been proposed as a technique for hiding the access latency of data referencing patterns that defeat caching strategies. Rather than waiting for a cache miss to initiate a memory fetch, data prefetching anticipates such misses and issues a fetch to the memory system in advance of the actual memory reference. To be effective, prefetching must be implemented in such a way that prefetches are timely, useful, and introduce little overhead. Secondary effects such as cache pollution and increased memory bandwidth requirements must also be taken into consideration. Despite these obstacles, prefetching has the potential to significantly improve overall program execution time by overlapping computation with memory accesses. Prefetching strategies are diverse, and no single strategy has yet been proposed that provides optimal performance. The following survey examines several alternative approaches and discusses the design tradeoffs involved when implementing a data prefetch strategy.