AN EFFECTIVE GARBAGE COLLECTION STRATEGY FOR PARALLEL PROGRAMMING-LANGUAGES ON LARGE-SCALE DISTRIBUTED-MEMORY MACHINES

Citation
K. Taura and A. Yonezawa, AN EFFECTIVE GARBAGE COLLECTION STRATEGY FOR PARALLEL PROGRAMMING-LANGUAGES ON LARGE-SCALE DISTRIBUTED-MEMORY MACHINES, ACM SIGPLAN NOTICES, 32(7), 1997, pp. 264-275
Citations number
35
Categorie Soggetti
Computer Sciences; Computer Science Software Graphics Programming
Journal title
ACM SIGPLAN NOTICES
Volume
32
Issue
7
Year of publication
1997
Pages
264 - 275
Database
ISI
SICI code
Abstract
This paper describes the design and implementation of a garbage collection scheme for large-scale distributed-memory computers and reports various experimental results. The collector is based on the conservative GC library by Boehm and Weiser. Each processor traces local pointers using the GC library, while remote pointers are traversed by exchanging "mark messages" between processors. The collector exhibits promising performance: in the most space-intensive settings we tested, the total collection overhead ranges from 5% to 15% of the application running time (excluding idle time). We not only examine basic performance figures, such as the total overhead and the latency of a global collection, but also demonstrate how local collection scheduling strategies affect application performance. In our collector, a local collection is scheduled either independently or synchronously. Experimental results show that the benefit of independent local collections has been overstated in the literature: independent local collections slowed application performance by up to 40% by increasing the average communication latency. Synchronized local collections exhibit much more robust performance characteristics than independent local collections, and the overhead of global synchronization is not significant. Furthermore, we show that an adaptive collection scheduler can select the appropriate local collection strategy based on the application's behavior. The collector has been used in the concurrent object-oriented language ABCL/f, and its performance was measured on a large-scale parallel computer (256 processors) using four non-trivial applications written in ABCL/f.
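The marking discipline the abstract describes, each processor tracing its own pointers directly and forwarding a "mark message" when it meets a pointer into another processor's heap, can be sketched in a few lines. This is an illustrative toy model, not the paper's implementation: the `Processor` class, the message inbox, and the round-based delivery loop are all assumptions made for the example, and the distributed termination detection a real collector needs is elided.

```python
# Hypothetical sketch of distributed marking with "mark messages".
# References are (owner_pid, obj_id) pairs; local ones are traced
# directly, remote ones are sent to the owning processor's inbox.

class Processor:
    def __init__(self, pid):
        self.pid = pid
        self.heap = {}      # obj_id -> list of (owner_pid, obj_id) references
        self.roots = []     # local root references
        self.marked = set() # mark bits for this processor's objects
        self.inbox = []     # pending mark messages from other processors

PROCS = {}  # pid -> Processor

def send_mark(owner_pid, obj_id):
    """Deliver a mark message to the processor owning obj_id."""
    PROCS[owner_pid].inbox.append(obj_id)

def trace(proc, obj_id):
    """Mark obj_id locally and follow its outgoing references."""
    if obj_id in proc.marked:
        return
    proc.marked.add(obj_id)
    for owner, child in proc.heap.get(obj_id, []):
        if owner == proc.pid:
            trace(proc, child)       # local pointer: trace with the local GC
        else:
            send_mark(owner, child)  # remote pointer: exchange a mark message

def global_mark(procs):
    """Mark from all roots, then drain mark messages until quiescent."""
    for p in procs.values():
        for owner, obj in p.roots:
            if owner == p.pid:
                trace(p, obj)
            else:
                send_mark(owner, obj)
    # Round-based delivery; a real system needs termination detection.
    progress = True
    while progress:
        progress = False
        for p in procs.values():
            pending, p.inbox = p.inbox, []
            for obj in pending:
                if obj not in p.marked:
                    trace(p, obj)
                    progress = True
```

For example, with processor 0 holding a root object `"a"` that points locally to `"b"` and remotely to `"x"` on processor 1, `global_mark` marks `"a"`, `"b"`, `"x"`, and `"x"`'s local children, while an unreferenced object on processor 1 stays unmarked and would be reclaimed.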