U. Rencuzogullari and S. Dwarkadas, Dynamic adaptation to available resources for parallel computing in an autonomous network of workstations, ACM SIGPLAN Notices, 36(7), 2001, pp. 72-81
Networks of workstations (NOWs), which are generally composed of autonomous
compute elements networked together, are an attractive parallel computing
platform since they offer high performance at low cost. The autonomous
nature of the environment, however, often results in inefficient
utilization due to load imbalances caused by three primary factors:
1) unequal load (compute or communication) assignment to equally-powerful
compute nodes, 2) unequal resources at compute nodes, and
3) multiprogramming. These load imbalances result in idle waiting time on
cooperating processes that need to synchronize or communicate data.
Additional waiting time may result due to local
scheduling decisions in a multiprogrammed environment. In this paper, we
present a combined approach of compile-time analysis, run-time load
distribution, and operating system scheduler cooperation for improved
utilization of available resources in an autonomous NOW. The techniques we
propose allow
efficient resource utilization by taking into consideration all three
causes of load imbalance in addition to locality of access in the process
of load distribution. The resulting adaptive load distribution and
cooperative scheduling system allows applications to take advantage of
parallel resources when available by providing better performance than
when the loaded resources are not used at all.