In the framework of fully permutable loops, tiling is a compiler technique
(also known as 'loop blocking') that has been extensively studied as a
source-to-source program transformation. Little work has been devoted to the
mapping and scheduling of the tiles onto physical parallel processors. We
present several new results in the context of limited computational resources
and assuming communication-computation overlap. In particular, under some
reasonable assumptions, we derive the optimal mapping and scheduling of tiles
to physical processors. Copyright (C) 1999 John Wiley & Sons, Ltd.