Hardware and compiler-directed cache coherence in large-scale multiprocessors: Design considerations and performance study

Authors
Citation
L. Choi and P.-C. Yew, Hardware and compiler-directed cache coherence in large-scale multiprocessors: Design considerations and performance study, IEEE PARALL, 11(4), 2000, pp. 375-394
Number of citations
40
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
ISSN journal
1045-9219
Volume
11
Issue
4
Year of publication
2000
Pages
375 - 394
Database
ISI
SICI code
1045-9219(200004)11:4<375:HACCCI>2.0.ZU;2-E
Abstract
In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. The scheme can be adapted to various cache organizations, including multiword cache lines and byte-addressable architectures. Several system-related issues, including critical sections, interthread communication, and task migration, have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been implemented on the Polaris parallelizing compiler [34]. From our simulation study using the Perfect Club benchmarks [5], we found that, in spite of the conservative analysis made by the compiler, for four of six benchmark programs tested, the proposed HSCD scheme outperforms the full-map hardware directory scheme by up to 70 percent, while the hardware scheme outperforms the HSCD scheme in the remaining two applications by up to 89 percent. Given its comparable performance and reduced hardware cost, the proposed scheme can be a viable alternative for large-scale multiprocessors such as the Cray T3D, which rely on users to maintain data coherence.