A three-dimensional wavefront array for matrix product with minimal
block pipelining period of 1 is introduced and compared to existing
systolic array architectures for matrix product. An optimal processor-time
product of n^3, with cycles defined computationally by two operations,
is obtained when successive problem instances are considered. The 3-D
architecture is extensible and scalable, is cycle invariant (in all
respects), is node invariant (in all respects), has minimal node
complexity of one multiply and one addition per cycle, has
unidirectional and local data forwarding in three dimensions, has 100%
utilization of processing elements, and has a cycle-invariant
one-to-one correspondence between input/output ports and input/output
matrix elements.
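
As a rough illustration of the node complexity and the processor-time figure quoted above, the following is a minimal functional sketch, not the architecture itself. It assumes an n x n x n grid in which node (i, j, k) performs one multiply and one addition and forwards its partial sum to node (i, j, k+1). That mapping, and the names `matmul_3d_grid` and `processor_time_product`, are assumptions made for illustration. Timing, wavefront propagation and the input/output ports are not modelled.

```python
def matmul_3d_grid(a, b):
    """Functional model of an assumed n*n*n grid of multiply-add nodes.

    Node (i, j, k) is assumed to compute a[i][k] * b[k][j] and add it to the
    partial sum received from node (i, j, k-1). This mapping is hypothetical.
    """
    n = len(a)
    # partial[i][j] is the value handed from layer k-1 to layer k.
    partial = [[0] * n for _ in range(n)]
    node_ops = 0  # number of multiply-add steps, one per node
    for k in range(n):          # layers along the forwarding direction
        for i in range(n):
            for j in range(n):
                partial[i][j] = partial[i][j] + a[i][k] * b[k][j]
                node_ops += 1
    return partial, node_ops


def processor_time_product(n, period=1):
    """Nodes times block pipelining period, assuming n**3 nodes."""
    return n ** 3 * period


if __name__ == "__main__":
    c, ops = matmul_3d_grid([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(c, ops, processor_time_product(2))
```

Under these assumptions, each node is used exactly once per problem instance, so with a block pipelining period of 1 the product of node count and period equals the n^3 multiply-add steps that the matrix product requires.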