SIMD arrays are likely to become increasingly important as coprocessors in
domain-specific systems as architects continue to leverage RAM technology in
their design. The problem this work addresses is the efficient evaluation
of SIMD arrays with respect to complex applications while accounting for
operating frequency and chip area. The underlying issues include the size of
the architecture space, the lack of portability of the test programs, and
the inherent complexity of simulating up to hundreds of thousands of
processing elements. The overall method we use is to combine architecture level
and Electronic Design Automation (EDA) level modeling by using an EDA-based
tool to calibrate architectural simulations. The resulting system retains
much of the high throughput of the architecture level simulator but it also
has accuracy similar to that of an early pass EDA synthesis and circuit
simulation. The particular problem of computational cost of the architectural
level simulation is addressed with a novel approach to trace-based
simulation (we call it trace compilation), which we find to be one to two
orders of magnitude faster than instruction level simulation while still
retaining much of the accuracy of the model. Furthermore, traces must be generated for
only a small fraction of the possible parameter combinations. Using trace
compilation also addresses program portability by allowing the user to code
in a single data parallel language with a single compiler, regardless of
the target architecture. We have used our system to evaluate thousands of
potential SIMD array designs with respect to real applications and present
some sample results. (C) 2000 Academic Press.
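
The reuse of one trace across many candidate designs can be illustrated with a minimal sketch. It shows only the general idea of trace-based evaluation and is not the trace compilation method described above; the operation classes, the `ArrayDesign` parameters, the cycle costs, and the clock frequencies are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TraceOp:
    kind: str   # hypothetical operation class: "alu", "mem", or "comm"
    count: int  # number of times the operation is issued to the array


@dataclass(frozen=True)
class ArrayDesign:
    # Hypothetical per-operation cycle costs for one candidate design.
    alu_cycles: int
    mem_cycles: int
    comm_cycles: int
    # Assumed to come from a lower-level (e.g. EDA-based) calibration step.
    clock_mhz: float


def estimate_runtime_us(trace, design):
    """Re-cost a recorded trace for one design without re-running the program."""
    cost = {
        "alu": design.alu_cycles,
        "mem": design.mem_cycles,
        "comm": design.comm_cycles,
    }
    cycles = sum(op.count * cost[op.kind] for op in trace)
    return cycles / design.clock_mhz  # cycles / (cycles per microsecond)


# One trace, recorded once (made-up operation counts).
trace = [TraceOp("alu", 1000), TraceOp("mem", 200), TraceOp("comm", 50)]

# Sweep a small, made-up design space using the same trace for every point.
designs = [
    ArrayDesign(a, m, c, f)
    for a, m, c, f in product((1, 2), (2, 4), (4, 8), (100.0, 200.0))
]
best = min(designs, key=lambda d: estimate_runtime_us(trace, d))
print(best, estimate_runtime_us(trace, best))
```

Because the trace is recorded once and only re-costed per design, the sweep avoids re-executing the application for every parameter combination, which is the property the abstract attributes to trace-based evaluation.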