Ls. Shen et al., A PARALLEL IMAGE-RENDERING ALGORITHM AND ARCHITECTURE BASED ON RAY-TRACING AND RADIOSITY SHADING, Computers & Graphics, 19(2), 1995, pp. 281-296
Number of citations
25
Subject Categories
Computer Sciences, Special Topics; Computer Science, Software, Graphics, Programming
In this paper, we explore ways to improve the performance of a
ray-casting-based approach for visualizing artificial scenes with
photorealism on the screen of a workstation. We aim at developing a
parallel image-rendering algorithm and architecture based on the
so-called two-pass approach[1-4]. This approach normally demands orders
of magnitude more processing power than a single processor provides if
we wish to make a
state-of-the-art image in real time or even interactive time. Several
attempts have been made to boost processing power by using parallel
architectures, but they still suffer from high overheads due to latency and
synchronization. In this paper, we argue that large speedups and low
overheads can only be attained through combined algorithm and
architecture design. Through this combined effort, we arrive at a good
algorithm-architecture pair, namely, the shelling technique[5-7] and
a pipelined parallel architecture[8], in which a parameterized space
partitioning on the one hand finds its counterpart in a scalable
network of clusters on the other. The target system, which consists of a
host computer and a scalable network of clusters, has been completely
modelled by using a mixed-level simulator called the Block Oriented
Network Simulator (BONeS®), and we have evaluated its performance for
a set of practical scenes. Promising results have been observed,
including the following:
1. The performance of the shelling technique is a weak function of the
scene complexity. The computational complexity of the shelling technique
is k x R (k is about 2-5), as compared to N x R (N is the total number
of patches) for the naive algorithm, where R is the total number of
intersection-computation rays.
2. A reasonable speedup has been observed up to 8 clusters. The limiting
factors in speedup are workload imbalance, the long latencies for global
memory requests, and the limited bandwidth provided by the system and
local buses. To achieve higher scalability of the system, further
improvement in the front-end system together with the use of a dynamic
workload balancing scheme would be necessary.
3. The performance of software intersection computation on an HP720 is
about 0.2 M/sec. The radiosity engine provides two orders of magnitude
more processing power per cluster than this software approach.
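
The complexity and throughput figures quoted above can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: the ray count R, the patch count N, and the choice k = 5 are hypothetical values, "0.2 M/sec" is read as 0.2 million intersection computations per second, and "two orders of magnitude" is assumed to mean exactly 100x.

```python
# Back-of-the-envelope comparison of naive vs. shelling intersection cost.
# All scene numbers below are hypothetical, chosen only for illustration.

def intersection_tests(rays: int, tests_per_ray: int) -> int:
    """Total intersection computations: (tests per ray) x (number of rays)."""
    return rays * tests_per_ray

def seconds_at(tests: int, rate_per_sec: float) -> float:
    """Time needed to perform `tests` computations at a given rate."""
    return tests / rate_per_sec

R = 1_000_000   # hypothetical number of intersection-computation rays
N = 10_000      # hypothetical total number of patches in the scene
K = 5           # upper end of the k range (about 2-5) quoted in the abstract

naive_tests = intersection_tests(R, N)    # naive algorithm: N x R
shelled_tests = intersection_tests(R, K)  # shelling technique: k x R

SOFTWARE_RATE = 0.2e6               # assumed reading of "0.2 M/sec" (HP720)
ENGINE_RATE = 100 * SOFTWARE_RATE   # assumption: "two orders" = exactly 100x

if __name__ == "__main__":
    print("naive tests:     ", naive_tests)
    print("shelling tests:  ", shelled_tests)
    print("reduction factor:", naive_tests // shelled_tests)
    print("software time (s):", seconds_at(shelled_tests, SOFTWARE_RATE))
    print("engine time (s):  ", seconds_at(shelled_tests, ENGINE_RATE))
```

Because the shelling cost depends on k rather than N, the reduction factor N / k grows with the number of patches, which is one way to read the claim that performance is only a weak function of scene complexity.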