In this paper we propose an improved algorithm for the parallel LU
decomposition of an (m + 1)-banded upper Hessenberg matrix on a
shared-memory multiprocessor, which requires O(2nm^2/p) parallel
operations, where n is the dimension of the matrix and p is the number
of processors. We show that for the special case of tridiagonal matrices
this algorithm has a lower operation count than those in the literature
and yields the best existing algorithm for the solution of tridiagonal
systems of equations.