Impact of CC-NUMA memory management policies on the application performance of multistage switching networks

Citation
L.N. Bhuyan et al., Impact of CC-NUMA memory management policies on the application performance of multistage switching networks, IEEE PARALL, 11(3), 2000, pp. 230-246
Number of citations
18
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
ISSN journal
1045-9219
Volume
11
Issue
3
Year of publication
2000
Pages
230 - 246
Database
ISI
SICI code
1045-9219(200003)11:3<230:IOCMMP>2.0.ZU;2-6
Abstract
In this paper, the impact of memory management policies and switch design alternatives on the application performance of cache-coherent nonuniform memory access (CC-NUMA) multiprocessors is studied in detail. Memory management plays an important role in determining the performance of NUMA multiprocessors by dictating the placement of data among the distributed memory modules. We analyze memory traces of several scientific applications for three different memory management techniques, namely buddy, round-robin, and first-touch policies, and compare their memory system performance. Interconnection network switch designs that consider virtual channels and varying numbers of input buffers per switch are presented. Our performance evaluation is based on an execution-driven simulation methodology to capture the dynamic changes in the network traffic during execution of the applications. It is shown that the use of cut-through switching with buffers and virtual channels can improve the average message latency tremendously. However, the choice of memory management policy affects the amount of network traffic and the network access pattern. Thus, we vary the memory management policy and confirm the performance benefits of improved switch designs. Results of sensitivity studies by varying switch design parameters, cache block size, and memory page size are also presented. We find that a combination of the first-touch memory management policy and a switch design with virtual channels and increased buffer space can reduce the average message latency by as much as 70 percent.
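To make the three page-placement policies named in the abstract concrete, the sketch below shows one plausible way each policy could choose the memory module (home node) for a virtual page. This is an illustrative assumption on my part, not the paper's simulator code; the names NUM_NODES, Page, and the place_* functions are hypothetical.

/*
 * Minimal sketch (assumed, not the paper's implementation) of how
 * buddy, round-robin, and first-touch placement might pick a home
 * memory module for a virtual page in a CC-NUMA system.
 */
#include <stdio.h>

#define NUM_NODES 16          /* assumed number of memory modules */
#define UNPLACED  (-1)

typedef struct {
    long vpn;                 /* virtual page number */
    int  home;                /* memory module that owns the page */
} Page;

/* Buddy-style: keep contiguous regions of pages on one node,
 * moving to the next node only after the region is filled. */
int place_buddy(long vpn, int pages_per_node)
{
    return (int)((vpn / pages_per_node) % NUM_NODES);
}

/* Round-robin: stripe consecutive pages across all modules. */
int place_round_robin(long vpn)
{
    return (int)(vpn % NUM_NODES);
}

/* First-touch: the page lands on the node of the first processor
 * that references it; later references keep that placement. */
int place_first_touch(Page *p, int touching_node)
{
    if (p->home == UNPLACED)
        p->home = touching_node;
    return p->home;
}

int main(void)
{
    Page p = { .vpn = 42, .home = UNPLACED };

    printf("buddy      : node %d\n", place_buddy(p.vpn, 256));
    printf("round-robin: node %d\n", place_round_robin(p.vpn));
    printf("first-touch: node %d\n", place_first_touch(&p, 3));
    return 0;
}

Under first-touch, pages tend to end up near the processor that uses them, which is consistent with the abstract's finding that first-touch combined with better switch designs yields the largest latency reduction.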