The Cydra 5 exploited both fine-grained and coarse-grained parallelism
for a broad class of compute-intensive tasks. It utilized a unique
architecture designed for efficient compilation and execution of inner
loops, as well as parallel execution of nonloop code. We discuss the
architecture and implementation of the numeric processor and its
associated high-bandwidth memory system as well as attributes of the overall
system.