Any parallel program has abstractions that are shared by the program's
multiple processes, including data structures containing shared data,
code implementing operations like global sums or minima, and type
instances used for process synchronization or communication. Such shared
abstractions can considerably affect the performance of parallel programs,
on both distributed and shared memory multiprocessors. As a result,
their implementation must be efficient, and such efficiency should be
achieved without unduly compromising program portability and
maintainability. Unfortunately, efficiency and portability can be at
cross-purposes, since high performance typically requires changes in the
representation of shared abstractions across different parallel machines. The
primary contribution of the DSA library presented and evaluated in this
paper is its representation of shared abstractions as objects that may
be internally distributed across different nodes of a parallel machine.
Such distributed shared abstractions (DSA) are encapsulated so that
their implementations are easily changed while maintaining program
portability across parallel architectures ranging from small-scale
multiprocessors, to medium-scale shared and distributed memory machines,
and, potentially, to networks of computer workstations. The principal
results presented in this paper are 1) a demonstration that the
fragmentation of object state across different nodes of a multiprocessor
machine can significantly improve program performance, and 2) a
demonstration that such object fragmentation can be achieved without
changing object interfaces, and therefore without compromising
portability. These results are obtained with implementations of the DSA
library on several medium-scale multiprocessors, including the BBN
Butterfly, Kendall Square Research, and SGI shared
memory multiprocessors. The DSA library's evaluation uses synthetic
workloads and a parallel implementation of a branch-and-bound algorithm
for solving the Traveling Salesperson Problem (TSP).
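
The idea of fragmenting object state behind an unchanged interface can be illustrated with a minimal sketch. It is not the DSA library's API: the class names `CentralizedSum` and `FragmentedSum`, the `add`/`total` methods, and the `run_workload` helper are all hypothetical, and Python threads merely stand in for the nodes of a multiprocessor. The sketch assumes a global sum as the shared abstraction and shows that the caller's code is identical for both representations.

```python
import threading


class CentralizedSum:
    """Hypothetical baseline: one shared value guarded by one lock."""

    def __init__(self, num_nodes):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, node, amount):
        # Every node contends for the same lock.
        with self._lock:
            self._value += amount

    def total(self):
        with self._lock:
            return self._value


class FragmentedSum:
    """Hypothetical fragmented variant: same interface, one fragment per node."""

    def __init__(self, num_nodes):
        self._locks = [threading.Lock() for _ in range(num_nodes)]
        self._fragments = [0] * num_nodes

    def add(self, node, amount):
        # An update touches only the calling node's own fragment.
        with self._locks[node]:
            self._fragments[node] += amount

    def total(self):
        # Fragments are combined on demand. This is not an atomic snapshot
        # across fragments; it is exact once all updates have finished.
        result = 0
        for node, lock in enumerate(self._locks):
            with lock:
                result += self._fragments[node]
        return result


def run_workload(sum_cls, num_nodes=4, updates_per_node=1000):
    """Client code: identical no matter which representation is chosen."""
    shared = sum_cls(num_nodes)

    def worker(node):
        for _ in range(updates_per_node):
            shared.add(node, 1)

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.total()
```

Calling `run_workload(CentralizedSum)` and `run_workload(FragmentedSum)` yields the same total; only the internal representation differs, which is the property that lets an implementation be swapped per machine without touching client code. The sketch says nothing about performance, which depends on the target architecture.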