An increasingly popular trend in high performance computing is the use of
symmetric multiprocessing (SMP) rather than the older massively parallel
processing (MPP) paradigm. MPI codes that ran and scaled well on MPP
machines can often be
run on an SMP machine using the vendor's version of MPI. However, this
approach may not make optimal use of the (expensive) SMP hardware. More
significantly, there are machines like Blue Horizon, an IBM SP with 8-way
SMP nodes at the San Diego Supercomputer Center, that can only support 4
MPI processes per node (with the current switch). On such a machine it is
imperative to be able to use OpenMP parallelism on the node, and MPI
between nodes.
We describe the challenges of converting MILC MPI code to use a second
level of OpenMP parallelism, and present benchmarks on IBM and Sun
computers.