The performance characteristics of several classes of parallel computing
systems are analyzed and compared using high-fidelity modeling and
execution-driven simulation. Processor, bus, and network models are used to construct
and simulate the architectures of symmetric multiprocessors (SMPs), clusters
of uniprocessors, and clusters of SMPs. To demonstrate a typical use, the
performance of ten systems is evaluated using a parallel
matrix-multiplication algorithm. Because the performance of a parallel algorithm
on an architecture depends on its communication-to-computation ratio, an analysis of
communication latencies for bus transactions, cache coherence, and network
transactions is used to quantify each system's communication overhead.
While low-level performance attributes are difficult to measure on
experimental testbed systems and difficult to represent accurately in purely
analytical models, they can be obtained readily and accurately with
high-fidelity simulation models. This level of detail allows the designer to rapidly
prototype and evaluate the performance of parallel and distributed systems.
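
As a rough illustration of why the communication-to-computation ratio matters for parallel matrix multiplication, the sketch below estimates that ratio for one worker. It assumes a simple row-block decomposition that the abstract does not specify: each of p workers receives n/p rows of A and all of B, and returns n/p rows of C. The function name and this cost model are hypothetical and are not taken from the simulated systems.

```python
def comm_to_comp_ratio(n: int, p: int) -> float:
    """Words communicated per floating-point operation for one worker.

    Assumed model (illustrative only): C = A x B with n x n matrices,
    split by row blocks across p workers.
    """
    if n <= 0 or p <= 0 or n % p != 0:
        raise ValueError("n and p must be positive and p must divide n")
    rows = n // p
    words_in = rows * n + n * n   # the worker's rows of A plus all of B
    words_out = rows * n          # the worker's rows of C
    flops = 2 * rows * n * n      # one multiply and one add per inner-product term
    return (words_in + words_out) / flops


if __name__ == "__main__":
    # Under this model the ratio simplifies to (p + 2) / (2n), so it grows
    # as more workers share a fixed problem size.
    for p in (1, 2, 4, 8):
        print(p, comm_to_comp_ratio(512, p))
```

Under these assumptions, adding workers to a fixed problem size raises the share of time spent communicating, which is why the latency of bus, cache-coherence, and network transactions shapes the achievable speedup.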