Cenju-3 is a parallel computer in which up to 256 processing elements
(PEs) are connected by a high-speed multistage interconnection network.
The architecture is tuned for systems of up to 256 processors. A VR4400
with 1 MB of secondary cache memory is implemented on a multi-chip
module to realize a compact, high-performance PE. The multistage
network is also implemented very compactly; the number of cables is
equal to the number of processors. The dedicated network interface
hardware designed for the system achieves low latency and high
throughput. This paper presents the machine architecture and its
evaluation.