Load balancing increases the efficient use of existing resources for
parallel and distributed applications. At a coarse level of granularity,
advances
in runtime systems for parallel programs have been proposed to control
available resources as efficiently as possible by utilizing idle resources
and using task migration. Simultaneously, at a finer level of granularity,
advances in algorithmic strategies for dynamically balancing computational
loads by data redistribution have been proposed to respond to variations in
processor performance during the execution of a given parallel
application. Combining strategies from each level of granularity can result
in a system which delivers advantages of both. The resulting integration
is systemic in nature and transfers the responsibility of efficient resource
utilization from the application programmer to the runtime system. This
paper presents the design and implementation of a system that combines an
algorithmic fine-grained data-parallel load balancing strategy with a systemic
coarse-grained task-parallel load balancing strategy, and reports on recent
experimental results of running a computationally intensive scientific
application under this integrated system. The experimental results indicate
that a distributed runtime environment which combines both task and data
migration can provide performance advantages with little overhead. The paper
also presents proposals for performance enhancements to the implementation,
as well
as future explorations for effective resource management. Copyright (C)
2001 John Wiley & Sons, Ltd.