The performance of very long instruction word (VLIW) microprocessors depends on close co-operation between the compiler and the architecture. Designing a high-performance VLIW therefore requires a testbed that allows detailed co-evaluation of compilation techniques and architectural features. The paper introduces a new VLIW testbed based on the SPARC instruction set architecture, which includes an aggressive scheduling compiler and a fast VLIW simulator. The compiler takes gcc-generated optimised SPARC code as input and generates parallelised VLIW code targeting advanced VLIW architectures. It can generate high-performance VLIW code, especially for non-numerical integer programs. The VLIW code is translated into a dedicated C program for fast and simple compiled simulation, which generates detailed data for performance evaluation. The authors have performed a comprehensive empirical study on the testbed for both large-resource and small-resource machines. The results show that a geometric-mean speedup of as much as fourfold is obtainable on nontrivial integer benchmarks without using branch probability when performing speculative code motion. Also analysed are the characteristics of the useful and useless ALU operations in each cycle, to see how the speedup is obtained. The analysis indicates that around half of the useful ALUs execute speculative instructions whose original paths are taken (thus being 'hit'), yet a substantial number of ALUs are also wasted owing to useless speculative execution or copy execution.
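
As an illustration of how a geometric-mean speedup over a benchmark suite can be computed, the following minimal Python sketch may help. It is not taken from the testbed: the function name `geometric_mean_speedup` and the cycle counts are hypothetical, invented only for the example, and it assumes that speedup is measured as the ratio of baseline cycles to VLIW cycles for each benchmark.

```python
import math


def geometric_mean_speedup(base_cycles, vliw_cycles):
    """Return the geometric mean of per-benchmark speedups.

    Speedup for one benchmark is assumed to be base cycles / VLIW cycles.
    """
    if len(base_cycles) != len(vliw_cycles) or not base_cycles:
        raise ValueError("need two non-empty sequences of equal length")
    # Sum the logarithms of the ratios rather than multiplying them,
    # so that long benchmark lists do not overflow or underflow.
    log_sum = sum(math.log(b / v) for b, v in zip(base_cycles, vliw_cycles))
    return math.exp(log_sum / len(base_cycles))


# Invented cycle counts for three hypothetical benchmarks (not paper data).
base = [600, 900, 1200]
vliw = [400, 300, 200]

# Per-benchmark speedups are 1.5, 3 and 6; their geometric mean is about 3.
speedup = geometric_mean_speedup(base, vliw)
print(f"geometric-mean speedup: {speedup:.2f}")
```

The geometric mean is the usual choice for summarising ratios such as speedups, because a single benchmark with an unusually large speedup influences it less than it would an arithmetic mean (which is 3.5 for the invented figures above).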