Different tasks in image processing exhibit different computational
requirements that should be considered with respect to the architecture.
This is particularly critical in parallel machines, where many
parallelization techniques, such as data partitioning and mapping onto
processors, use of shared memory space, and exploitation of pipelining with
pre-fetching, dramatically affect performance in a way that depends strongly
on both algorithm and architectural parameters.
The paper defines computational models for tightly-coupled multiprocessors
with crossbar architecture, both for data-parallel local algorithms and for
global algorithms such as spatial transformations. To overcome the intrinsic
memory limitations of low-cost, highly integrated systems, the paper
proposes to extend the classical block processing model by also analytically
modeling the case of multiple processing stages.
The models have been compared in detail and have been effectively adopted to
optimize the performance of block processing on crossbar multiprocessors for
low-level computer vision applications.