B. Weissman, PERFORMANCE COUNTERS AND STATE SHARING ANNOTATIONS - A UNIFIED APPROACH TO THREAD LOCALITY, ACM SIGPLAN NOTICES, 33(11), 1998, pp. 127-138
This paper describes a combined approach for improving thread locality
that uses the hardware performance monitors of modern processors and
program-centric code annotations to guide thread scheduling on SMPs.
The approach relies on a shared state cache model to compute expected
thread footprints in the cache on-line. The accuracy of the model has
been analyzed by simulations involving a set of parallel applications.
We demonstrate how the cache model can be used to implement several
practical locality-based thread scheduling policies with little
overhead. Active Threads, a portable, high-performance thread system,
has been built and used to investigate the performance impact of
locality scheduling for several applications.