Mk. Tcheun et al., A SIMPLE HARDWARE PREFETCHING SCHEME USING SEQUENTIALITY FOR SHARED-MEMORY MULTIPROCESSORS, IEICE Transactions on Information and Systems, E80D(11), 1997, pp. 1055-1063
To reduce memory access latency on shared-memory multiprocessors,
several prefetching schemes have been proposed. The sequential
prefetching scheme is a simple hardware-controlled scheme that
exploits the sequentiality of memory accesses to predict which blocks
will be read in the near future. Aggressive sequential prefetching
prefetches many blocks on each miss to reduce the miss rate, and
performs well for application programs with high sequentiality.
Conservative sequential prefetching, by contrast, prefetches only a
few blocks on each miss to avoid prefetching useless blocks, and
outperforms aggressive sequential prefetching for application programs
with low sequentiality. We analyze the relationship between the
sequentiality of application programs and the effectiveness of
sequential prefetching under various memory and network latencies, and
propose a new adaptive sequential prefetching scheme. By simply adding
a small table to the sequential prefetching scheme, the proposed
scheme prefetches a large number of blocks for application programs
with high sequentiality, reducing the miss rate significantly, and
prefetches a small number of blocks for application programs with low
sequentiality, avoiding the loading of useless blocks.
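
The trade-off between aggressive and conservative sequential
prefetching can be illustrated with a small simulation. The sketch
below is not the scheme from the paper: the function name `simulate`,
the LRU replacement policy, the cache capacity, and the two synthetic
traces are all assumptions chosen for illustration, and the small
table that drives the adaptive scheme is not modeled because the
abstract does not describe its contents.

```python
from collections import OrderedDict


def simulate(trace, degree, capacity=64):
    """Replay block addresses through a small LRU cache that, on every
    miss, also fetches the next `degree` consecutive blocks.

    Hypothetical model for illustration only.
    Returns (misses, prefetched, useless).
    """
    # Maps block -> True while the block was prefetched but not yet used.
    cache = OrderedDict()
    misses = prefetched = useless = 0

    def insert(block, is_prefetch):
        nonlocal useless
        if block in cache:
            return False
        if len(cache) >= capacity:
            # Evict the least recently used block; count it as useless
            # if it was prefetched and never referenced.
            _, unused = cache.popitem(last=False)
            if unused:
                useless += 1
        cache[block] = is_prefetch
        return True

    for block in trace:
        if block in cache:
            cache[block] = False       # a prefetched block became useful
            cache.move_to_end(block)   # refresh its LRU position
        else:
            misses += 1
            insert(block, False)
            for k in range(1, degree + 1):
                if insert(block + k, True):
                    prefetched += 1

    # Prefetched blocks still unused at the end were also useless.
    useless += sum(cache.values())
    return misses, prefetched, useless


# Assumed traces: consecutive blocks (high sequentiality) and a
# stride of 16 blocks (low sequentiality).
sequential = list(range(256))
strided = [i * 16 for i in range(256)]

for name, trace in (("sequential", sequential), ("strided", strided)):
    for degree in (1, 4):
        print(name, degree, simulate(trace, degree))
# sequential 1 (128, 128, 0)
# sequential 4 (52, 208, 4)
# strided 1 (256, 256, 256)
# strided 4 (256, 1024, 1024)
```

On the sequential trace the larger degree cuts misses from 128 to 52
at the cost of only 4 useless blocks, while on the strided trace it
removes no misses and quadruples the useless traffic. This is the gap
an adaptive choice of prefetch degree is meant to close.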