Several useful compiler and program transformation techniques for
superthreaded architectures((1)) are presented in this paper. The
superthreaded architecture adopts a thread pipelining execution model to
facilitate runtime data dependence checking between threads and to maximize
thread overlap, which enhances concurrency. The techniques presented here
facilitate concurrent execution among threads and manage critical system
resources, such as the memory buffers, effectively. We evaluate the
effectiveness of these program transformation techniques by applying them
manually to several benchmark programs and running the transformed programs
on a trace-driven, cycle-by-cycle superthreaded processor simulator. The
simulation results show that a superthreaded processor can achieve promising
speedup for most of the benchmark programs.