Ka. Hua et al., Performance of load balancing techniques for join operations in shared-nothing database management systems, J PAR DISTR, 56(1), 1999, pp. 17-46
We investigate various load balancing approaches for hash-based join techni
ques popular in multicomputer-based shared-nothing database systems. When t
he tuples are not uniformly distributed among the hash buckets, redistribut
ion of these buckets among the processors is necessary to maintain good sys
tem performance. Two recent load balancing techniques which rely on samplin
g and incremental balancing, respectively, have been shown to be more robus
t than conventional methods. The comparison of these two approaches, howeve
r, has not been investigated. In this study, we improve these two schemes a
nd implement them along with a conventional method and a standard join tech
nique which does not do load balancing on an nCUBE/2 parallel computer to c
ompare their performance. Our experimental results indicate that the sampli
ng technique is the better approach. To further evaluate the performance of
these techniques under diverse hardware conditions, we also develop a cost
model and implement a simulator to perform sensitivity analyses with respe
ct to various hardware parameters. The simulation results show that both sa
mpling and incremental techniques provide noticeable savings over conventio
nal methods, with the sampling approach being more scalable in supporting v
ery large database systems. (C) 1999 Academic Press.