Performance of load balancing techniques for join operations in shared-nothing database management systems

Citation
Ka. Hua et al., Performance of load balancing techniques for join operations in shared-nothing database management systems, J PAR DISTR, 56(1), 1999, pp. 17-46
Citations number
26
Categorie Soggetti
Computer Science & Engineering
Journal title
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
ISSN journal
07437315 → ACNP
Volume
56
Issue
1
Year of publication
1999
Pages
17 - 46
Database
ISI
SICI code
0743-7315(199901)56:1<17:POLBTF>2.0.ZU;2-W
Abstract
We investigate various load balancing approaches for hash-based join techni ques popular in multicomputer-based shared-nothing database systems. When t he tuples are not uniformly distributed among the hash buckets, redistribut ion of these buckets among the processors is necessary to maintain good sys tem performance. Two recent load balancing techniques which rely on samplin g and incremental balancing, respectively, have been shown to be more robus t than conventional methods. The comparison of these two approaches, howeve r, has not been investigated. In this study, we improve these two schemes a nd implement them along with a conventional method and a standard join tech nique which does not do load balancing on an nCUBE/2 parallel computer to c ompare their performance. Our experimental results indicate that the sampli ng technique is the better approach. To further evaluate the performance of these techniques under diverse hardware conditions, we also develop a cost model and implement a simulator to perform sensitivity analyses with respe ct to various hardware parameters. The simulation results show that both sa mpling and incremental techniques provide noticeable savings over conventio nal methods, with the sampling approach being more scalable in supporting v ery large database systems. (C) 1999 Academic Press.