We describe the design and implementation of a parallel QR decompositi
on algorithm for a large sparse matrix A. The algorithm is based on th
e multifrontal approach and makes use of Householder transformations.
The tasks are distributed among processors according to an assembly tr
ee which is built from the symbolic factorization of the matrix A(T)A.
We first address uniprocessor issues and then discuss the multiproces
sor implementation of the method. We consider the parallelization of b
oth the factorization phase and the solve phase. We use relaxation of
the sparsity structure of both the original matrix and the frontal mat
rices to improve the performance. We show that, in this case, the use
of Level 3 BLAS can lead to very significant gains in performance. We
use the eight processor Alliant FX/80 at CERFACS to illustrate our dis
cussion.