We consider the algebraic multilevel iteration (AMLI) for the solution of
systems of linear equations as they arise from a finite-difference
discretization on a rectangular grid. The key operation is the matrix-vector
product, which can be executed efficiently on vector and parallel-vector
computer architectures if the nonzero entries of the matrix are concentrated
in a few diagonals. In order to maintain this structure for all matrices on
all levels,
coarsening in alternating directions is used. In some cases it is necessary
to introduce additional dummy grid hyperplanes. The data movements in the
restriction and prolongation are crucial, as they produce massive memory
conflicts on vector architectures. By using a simple performance model, the
best of the possible vectorization strategies is automatically selected at
runtime. Examples show that on a Fujitsu VPP300 the presented implementation
of AMLI reaches about 85% of the useful performance, and scalability with
respect to computing time can be achieved.
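
To illustrate why a matrix whose nonzeros lie on a few diagonals suits vector hardware, the following is a minimal sketch of a matrix-vector product in diagonal storage. It is not the implementation described above. The function names, the storage convention (diagonals[k][i] holds A[i][i + offsets[k]]), and the five-point stencil used as the test matrix are all assumptions made for this example.

```python
def dia_matvec(offsets, diagonals, x):
    """Return y = A x for a square matrix A stored by diagonals.

    Assumed convention: diagonals[k][i] holds A[i][i + offsets[k]].
    Entries that would fall outside the matrix are never read.
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, diagonals):
        lo = max(0, -off)      # first row where column i + off >= 0
        hi = min(n, n - off)   # one past the last row where i + off < n
        # Each diagonal contributes one long, unit-stride loop with no
        # indirect addressing; this is the loop a vector unit would run.
        for i in range(lo, hi):
            y[i] += diag[i] * x[i + off]
    return y


def five_point_laplacian(nx, ny):
    """Five-point finite-difference stencil on an nx-by-ny grid.

    Hypothetical test matrix with lexicographic ordering of grid points,
    which puts all nonzeros on the diagonals -nx, -1, 0, +1, +nx.
    """
    n = nx * ny
    offsets = [-nx, -1, 0, 1, nx]
    south = [-1.0] * n
    west = [-1.0] * n
    center = [4.0] * n
    east = [-1.0] * n
    north = [-1.0] * n
    for i in range(n):
        if i % nx == 0:        # left grid boundary: no west neighbour
            west[i] = 0.0
        if i % nx == nx - 1:   # right grid boundary: no east neighbour
            east[i] = 0.0
    return offsets, [south, west, center, east, north]


if __name__ == "__main__":
    offsets, diagonals = five_point_laplacian(3, 3)
    print(dia_matvec(offsets, diagonals, [1.0] * 9))
```

Applied to a vector of ones on a 3-by-3 grid, each entry of the result equals 4 minus the number of grid neighbours of that point. The zeros written into the west and east diagonals at the grid boundary keep every diagonal at full length, so the inner loop stays free of branches.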