Multimedia algorithms involve enormous amounts of data transfer and
storage, resulting in huge bandwidth requirements at the off-chip memory
and system bus level. As a result, the related energy consumption becomes
critical. Even for execution time, the bottleneck can shift from the CPU to
the external bus load. This paper demonstrates a systematic software
approach to reduce this system bus load. It consists of source-to-source
code transformations that have to be applied before conventional ILP
compilation. To illustrate this, we use a cavity detection algorithm for
medical imaging that is mapped onto an Intel Pentium (R) II processor.