Parallel workstations, each comprising tens of processors based on shared memory, promise cost-effective scalable multiprocessing. This article explores the coupling of such small- to medium-scale shared-memory multiprocessors through software over a local area network to synthesize larger shared-memory systems. We call these systems Distributed Shared-memory MultiProcessors (DSMPs). This article introduces the design of a shared-memory system that uses multiple granularities of sharing, called MGS, and presents a prototype implementation of MGS on the MIT Alewife multiprocessor. Multigrain shared memory enables the collaboration of hardware and software shared memory, thus synthesizing a single transparent shared-memory address space across a cluster of multiprocessors. The system leverages the efficient support for fine-grain cache-line sharing within multiprocessor nodes as often as possible, and resorts to coarse-grain page-level sharing across nodes only when absolutely necessary. Using our prototype implementation of MGS, an in-depth study of several shared-memory applications is conducted to understand the behavior of DSMPs. Our study is the first to comprehensively explore the DSMP design space, and to compare the performance of DSMPs against all-software and all-hardware DSMs on a single experimental platform. Keeping the total number of processors fixed, we show that applications execute up to 85% faster on a DSMP as compared to an all-software DSM. We also show that all-hardware DSMs hold a significant performance advantage over DSMPs on challenging applications, between 159% and 1014%. However, program transformations to improve data locality for these applications allow DSMPs to almost match the performance of an all-hardware multiprocessor of the same size.
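To make the two sharing grains concrete, the C sketch below caricatures how a single shared read might be resolved on a DSMP: an access whose home lies within the local multiprocessor node is served by hardware cache-line coherence, while an access to a remote node falls back to page-level software shared memory. This is an illustrative sketch only, not the MGS implementation; every identifier in it (node_of, hw_load, sw_dsm_map_page, LOCAL_NODE, PAGE_SIZE) is an assumption made up for this example.

```c
/* Illustrative sketch of the multigrain idea, not the actual MGS code.
 * All helpers here are stand-ins invented for this example. */
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE   4096u
#define LOCAL_NODE  0   /* id of this multiprocessor node (assumed) */

/* Assumed: which multiprocessor node is home for this address
 * (here, simply taken from the high address bits). */
static int node_of(uintptr_t addr) { return (int)(addr >> 28) & 0x7; }

/* Assumed: an ordinary load; within a node the hardware keeps such
 * accesses coherent at cache-line granularity. */
static long hw_load(uintptr_t addr) { return *(volatile long *)addr; }

/* Assumed: software DSM handler that replicates a remote page locally.
 * A real handler would fetch the page over the LAN and update a
 * per-page coherence directory; here it just returns a blank copy. */
static void *sw_dsm_map_page(uintptr_t page_addr)
{
    (void)page_addr;
    return calloc(1, PAGE_SIZE);   /* placeholder local page copy */
}

/* One shared-memory read, resolved at the finest grain available. */
static long mgs_read(uintptr_t addr)
{
    if (node_of(addr) == LOCAL_NODE) {
        /* Intra-node: fine-grain, hardware cache-line sharing
         * (the common, fast case). */
        return hw_load(addr);
    }
    /* Inter-node: coarse-grain, page-level software sharing.
     * Once the page is replicated locally, later accesses to it are
     * again handled by the node's hardware coherence. */
    uintptr_t page = addr & ~(uintptr_t)(PAGE_SIZE - 1);
    long *copy = (long *)sw_dsm_map_page(page);
    return copy[(addr - page) / sizeof(long)];
}
```

The point of the sketch is the asymmetry: the intra-node path is a plain memory access, while the inter-node path pays a page-grain software cost, which is why MGS tries to keep sharing within nodes whenever possible.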