This paper presents a block variant of the GMRES method for solving ge
neral unsymmetric linear systems. This algorithm generates a transform
ed Hessenberg matrix by solely using block matrix operations and block
data communications. It is shown that this algorithm with block size
s, denoted by BVGMRES(s, m), is theoretically equivalent to the GMRES(
s, m) method. The numerical results demonstrate that this algorithm ca
n be more efficient than the standard GMRES method on a cache based si
ngle CPU computer with optimized BLAS kernels, Furthermore, the gain i
n efficiency is more significant on MPPs due to both efficient block o
perations and efficient block data communications. Preliminary numeric
al results on some real-world problems also show that this algorithm m
ay be stable up to some reasonable block size.