The FLAPW (full-potential linearized-augmented plane-wave) method is one of
the most accurate first-principles methods for determining structural,
electronic and magnetic properties of crystals and surfaces. Until the present
work, the FLAPW method has been limited to systems of fewer than about a
hundred atoms due to the lack of an efficient parallel implementation to
exploit the power and memory of parallel computers. In this work, we present an
efficient parallelization of the method that divides the plane-wave components
of each state among the processors. The code is also optimized for
RISC (reduced instruction set computer) architectures, such as those found
on most parallel computers, making full use of BLAS (basic linear algebra
subprograms) wherever possible. Scaling results are presented for systems of
up to 686 silicon atoms and 343 palladium atoms per unit cell, running on
up to 512 processors on a Cray T3E parallel supercomputer. (C) 2000 Elsevier
Science B.V. All rights reserved.