The authors discuss methods for expressing and tuning the performance of
parallel programs that combine two programming models, distributed memory
and shared memory, in the same program. Such methods are important for
anyone who runs parallel programs on large machines, as well as for those
who study combinations of the two programming models.