K. Dackland et B. Kagstrom, Blocked algorithms and software for reduction of a regular matrix pair to generalized schur form, ACM T MATH, 25(4), 1999, pp. 425-454
A two-stage blocked algorithm for reduction of a regular matrix pair (A, B)
to upper Hessenberg-triangular form is presented. In stage 1 (A, B) is red
uced to block upper Hessenberg-triangular form using mainly level 3 (matrix
-matrix) operations that permit data reuse in the higher levels of a memory
hierarchy. In the second stage all but one of the r subdiagonals of the bl
ock Hessenberg A-part are set to zero using Givens rotations. The algorithm
proceeds in a sequence of supersweeps, each reducing m columns. The update
s with respect to row and column rotations are organized to reference conse
cutive columns of A and B. To further improve the data locality, all rotati
ons produced in a supersweep are stored to enable a left-looking reference
pattern, i.e., all updates are delayed until they are required for the cont
inuation of the supersweep. Moreover, we present a blocked variant of the s
ingle diagonal double-shift QZ method for computing the generalized Schur f
orm of(A, B) in upper Hessenberg-triangular form. The blocking for improved
data locality is done similarly, now by restructuring the reference patter
n of the updates associated with the bulge chasing in the QZ iteration. Tim
ing results show that our new blocked variants outperform the current LAPAC
K routines, including drivers for the generalized eigenvalue problem, by a
factor 2-5 for sufficiently large problems.