Kl. Tan and Hj. Lu, SKEW HANDLING STRATEGIES FOR PIPELINED PROCESSING OF MULTI-JOIN QUERIES IN SHARED-NOTHING SYSTEMS, Computer systems science and engineering, 10(1), 1995, pp. 3-18
Number of citations
29
Subject Categories
System Science; Computer Application, Chemistry & Engineering; Computer Sciences, Special Topics; Computer Science Theory & Methods
This paper looks at how to effectively exploit pipelining for multi-join queries in shared-nothing systems. A promising technique that has appeared in the literature uses an iterative approach to process a multi-join query. In each iteration, several relations are selected and joined in a pipelined fashion. However, optimization algorithms based on this approach have traditionally assumed that the relations are uniformly distributed or only slightly skewed. When this assumption is relaxed, that is, when the data is skewed, the performance of the system may degrade drastically. We propose four skew handling techniques to deal with data skew for multi-join queries. An analytical model is presented and used to compare the performance of each technique. Our results show that the nested-loops technique performs worst, while the hybrid technique is superior in most cases.
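
To illustrate why skewed data can hurt a shared-nothing join, the following minimal sketch partitions join keys across nodes and compares the busiest node with the average. It is not taken from the paper and does not implement any of its four techniques; the function names, the modulo partitioning rule, the node count, and the key distributions are all assumptions chosen for illustration.

```python
def partition_loads(keys, num_nodes):
    """Count tuples per node when integer join keys are assigned by key % num_nodes.

    Modulo partitioning is a stand-in for whatever hash function a real
    system would use (an assumption made to keep the example deterministic).
    """
    loads = [0] * num_nodes
    for key in keys:
        loads[key % num_nodes] += 1
    return loads


def skew_ratio(loads):
    """Return max load divided by mean load; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical data: 1000 tuples spread over 4 nodes.
uniform_keys = list(range(1000))            # every key value appears once
skewed_keys = [7] * 700 + list(range(300))  # one hot key carries 70% of the tuples

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print(uniform_loads, skew_ratio(uniform_loads))
print(skewed_loads, skew_ratio(skewed_loads))
```

In a pipelined plan, each stage can finish only as fast as its most heavily loaded node, so a ratio well above 1.0 suggests that one node would hold up the others.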