M. Gusev et Dj. Evans, Elimination of the computational broadcast in systolic arrays: An application to the QR decomposition algorithm, INT J COM M, 74(4), 2000, pp. 449-469
Two types of broadcast in algorithms are determined: (1) a data broadcast,
where one data value is used for more than one computation and (2) a comput
ational broadcast where one variable is computed in more then one computati
on. Both types of broadcast are preferred to be eliminated when a processor
array implementation is desired by using VLSI technology.
When the algorithm computes only one variable value for each index vector t
hen the computational broadcast can be eliminated in a straight forward man
ner by introducing counter values resulting in a single assignment code. Ho
wever in cases when the algorithm computes two or more variable values that
are specified by a different computational broadcast, has not been conside
red. As far as is known it has been solved by deriving localized algorithms
in single assignment code heuristically.
In this paper we define this problem in terms of a system of affine recurre
nce equations and analyze the data dependencies introduced. Then we show a
synthesis procedure that eliminates the computational broadcast and a few e
xamples of implementation are shown. The QR decomposition algorithm is also
presented in a localized single assignment code by using the proposed meth
od and several different parallel implementations are discussed.