H. Kutluca et al., Image-space decomposition algorithms for sort-first parallel volume rendering of unstructured grids, J SUPERCOMP, 15(1), 2000, pp. 51-93
Twelve adaptive image-space decomposition algorithms are presented for sort
-first parallel direct volume rendering (DVR) of unstructured grids on dist
ributed-memory architectures. The algorithms are presented under a novel ta
xonomy based on the dimension of the screen decomposition, the dimension of
the workload arrays used in the decomposition, and the scheme used for wor
kload-array creation and querying the workload of a: region. For the 2D dec
omposition schemes using 2D workload arrays, a novel scheme is proposed to
query the exact number of screen-space bounding boxes of the primitives in
a screen region in constant time. A probe-based chains-on-chains partitioni
ng algorithm is exploited for load balancing in optimal 1D decomposition an
d iterative 2D rectilinear decomposition (RD). A new probe-based optimal 2D
jagged decomposition (OJD) is proposed which is much faster than the dynam
ic-programming based OJD scheme proposed in the literature. The summed-area
table is successfully exploited to query the workload of a rectangular reg
ion in constant time in both OJD and RD schemes for the subdivision of gene
ral 2D workload arrays. Two orthogonal recursive bisection (ORB) variants a
re adapted to relax the straight-line division restriction in conventional
ORE through using the medians-of-medians approach on regular mesh and quadt
ree superimposed on the screen. Two approaches based on the Hilbert space-f
illing curve and graph-partitioning are also proposed. An efficient primiti
ve classification scheme is proposed for redistribution in 1D, and 2D recti
linear and jagged decompositions. The performance comparison of the decompo
sition algorithms is modeled by establishing appropriate quality measures f
or load-balancing, amount of primitive replication and parallel execution t
ime. The experimental results on a Parsytec CC system using a set of benchm
ark volumetric datasets verify the validity of the proposed performance mod
els. The performance evaluation of the decomposition algorithms is also car
ried out through the sort-first parallelization of an efficient DVR algorit
hm.