This paper describes a new approach to finding performance bottlenecks
in shared-memory parallel programs and its embodiment in the Paradyn
Parallel Performance Tools running with the Blizzard fine-grain
distributed shared memory system. This approach exploits the underlying
system's cache coherence protocol to detect data sharing patterns that
indicate potential performance bottlenecks and presents performance
measurements in a data-centric manner. As a demonstration, Paradyn
helped us improve the performance of a new shared-memory application
program by a factor of four.