Vision is a challenging application for high-performance computing (HPC). Many vision tasks have stringent latency and throughput requirements. Further, the vision process has a heterogeneous computational profile: low-level vision consists of structured computations with regular data dependencies, while the subsequent higher-level operations consist of symbolic computations with irregular data dependencies. Over the years, many approaches to high-speed vision have been pursued. VLSI hardware solutions such as application-specific integrated circuits (ASICs) and digital signal processors (DSPs) have provided good processing speeds on structured low-level vision tasks.
Special-purpose systems for vision have also been designed. Currently, there is growing interest in using general-purpose parallel systems for vision problems. These systems offer the advantages of higher performance, software programmability, generality, and architectural flexibility over the earlier approaches. The choice of low-cost commercial-off-the-shelf (COTS) components as building blocks for these systems leads to easy upgradability and increased system life.

The main focus of this paper is on effectively using COTS-based general-purpose parallel computing platforms to realize high-speed implementations of vision tasks. Given the successful use of COTS-based systems in a variety of high-performance applications, it is attractive to consider their use for vision applications as well. However, the irregular data dependencies in vision tasks lead to large communication overheads on HPC systems. At the University of Southern California, our research efforts have been directed toward designing scalable parallel algorithms for vision tasks on HPC systems. In our approach, we use the message-passing programming model to develop portable code. Our algorithms are specified using C and MPI. In this paper, we summarize our efforts,
and illustrate our approach using several example vision tasks.

To facilitate the analysis and development of scalable algorithms, a realistic computational model of the parallel system must be used. Several such models have been proposed in the literature. We use the General-purpose Distributed Memory (GDM) model, a simple but realistic model of state-of-the-art parallel machines. Using the GDM model, we develop generic algorithmic techniques such as data remapping, overlapping of communication with computation, message packing, asynchronous execution, and communication scheduling. Using these techniques, we have developed scalable algorithms for many vision tasks. For instance, a scalable algorithm for linear approximation has been developed using the asynchronous execution technique. Using this algorithm, linear feature extraction can be performed in 0.065 s on a 64-node SP-2 for a 512 x 512 image. A serial implementation takes 3.45 s for the same task.
Similarly, the communication scheduling and decomposition techniques lead to a scalable algorithm for the line grouping task. We believe that such an algorithmic approach can result in the development of scalable and portable solutions for vision tasks.