LOAD BALANCED MAPPING OF DISTRIBUTED OBJECTS TO MINIMIZE NETWORK COMMUNICATION

Citation
Ad. Stoyenko et al., LOAD BALANCED MAPPING OF DISTRIBUTED OBJECTS TO MINIMIZE NETWORK COMMUNICATION, Journal of parallel and distributed computing, 34(2), 1996, pp. 117-136
Citations number
36
Categorie Soggetti
Computer Sciences","Computer Science Theory & Methods
ISSN journal
07437315
Volume
34
Issue
2
Year of publication
1996
Pages
117 - 136
Database
ISI
SICI code
0743-7315(1996)34:2<117:LBMODO>2.0.ZU;2-I
Abstract
This paper introduces a new load balancing and communication minimizin g heuristic used in the In verse Remote Procedure Call (IRPC) system. While the paper briefly describes the IRPC system, the focus is on the new IRPC assignment heuristic. The IRPC compiler maps a distributed p rogram to a graph that represents program objects and their dependenci es (due to invocations and parameter passing) as nodes and edges, resp ectively. In the graph, the system preserves conditional and iterative flows, records network transmission and execution costs, and marks no des that have to reside at specific network sites. The graph is then p artitioned by the heuristic to derive a (sub)optimal node assignment t o network sites minimizing load balancing and network data transport. The resulting program partition is then reflected in the physical obje ct distribution, and remote and local object communication is transpar ently implemented. The compiler and run-time system use efficient impl ementation techniques such as type prediction, inlining, splitting and subprogram passing. The last of these allows remote code to be copied to local data, as an alternative to copying data to the remote site, whenever this will reduce network data transport. The IRPC graph parti tioning heuristic operates in time O(E(log d + l + log M)), where M is the number of network sites, E is the number of communication edges, and d is the maximum degree of a node; l is a parameter of the algorit hm, and can vary between 1 and N, where N is the number of communicati ng objects. This complexity is more nearly independent of M, and consi derably better in terms of E and N, than that of previously known rela ted algorithms, such as A, which employs backtracking and is potentia lly exponential, or the max-flow/min-cut class of network flow algorit hms or heuristics which tend to be at least of Omega(MN(2)E), and it c an be made (by choosing l appropriately) as efficient as even such fas t heuristics as heaviest-edge-first, minimal communication, and Kernig han-Lin. In an extensive quantitative evaluation, the heuristic has be en demonstrated to perform very well, giving on the average 75% traffi c cost reductions for over 95% of the programs when compared to random partitioning, and outperforming in cost reduction and actual executio n time the three aforementioned fast heuristics, even with a large l. Thus, to the best of our knowledge, this is the first report of a well -performing assignment heuristic that is both essentially linear in th e number of communication edges, and better than existing, established heuristics of no better complexity. (C) 1996 Academic Press, Inc.