A major concern with high-performance general-purpose workstations is
to speed up the execution of commands, uniprocess applications, and
multiprocess applications with coarse- to medium-grain parallelism.
To that end, a simple extension of a uniprocessor machine, such as a
shared-bus, shared-memory architecture, can be employed. Both kinds of
machines generally use the same OS model, and the same application can
execute on these machines without recoding. However, an intrinsic limitation
of the shared-bus architecture is the low number of processors that
can be connected to the shared bus. When this number exceeds a critical
value, the system's global performance drops drastically because of
bus saturation. When two or more processors store a copy of the same
memory block in their respective caches and one of them performs a write
operation on a location in that block, a set of bus actions is
necessary to guarantee that every subsequent read operation by any processor
can get the up-to-date value of the modified location. Typically,
researchers use simulation to investigate how to improve the performance
of such machines. In particular, trace-driven simulation offers a good
trade-off between speed, accuracy, and flexibility. A key point of
this methodology is to find traces that both represent typical operating
conditions and include all information potentially needed for an
accurate simulation of the system. The authors have developed a
methodology and a set of tools (called Trace Factory) to generate traces for the
performance evaluation of shared-bus, shared-memory multiprocessor
systems. Trace Factory is particularly useful for evaluating a
multiprocessor architecture's performance in relation to different workloads and to
most of the influencing activities of the operating system. The
designer can evaluate and tune architectural solutions for the coherence
protocol, cache structure, bus, and memory.
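
The coherence traffic described above can be illustrated with a minimal
trace-driven sketch. It is not Trace Factory and does not use its trace
format: the (cpu, op, address) record layout, the 32-byte block size, the
infinite caches, and the simplified MSI write-invalidate protocol are all
assumptions chosen only to show how replaying a trace yields counts of bus
actions.

```python
from collections import defaultdict

BLOCK_SIZE = 32  # bytes per cache block (assumed for illustration)


def simulate(trace, num_cpus, block_size=BLOCK_SIZE):
    """Replay (cpu, op, address) records through a simplified MSI
    write-invalidate protocol and count the resulting bus transactions.

    Caches are assumed infinite, so there are no capacity evictions.
    """
    # state[cpu][block] is "M" (modified) or "S" (shared); absent = invalid
    state = [dict() for _ in range(num_cpus)]
    bus = defaultdict(int)

    for cpu, op, addr in trace:
        block = addr // block_size
        mine = state[cpu].get(block)  # None means the copy is invalid
        others = [c for c in range(num_cpus)
                  if c != cpu and block in state[c]]

        if op == "R":
            if mine is None:
                bus["read_miss"] += 1
                for c in others:
                    if state[c][block] == "M":
                        bus["write_back"] += 1  # owner supplies fresh data
                        state[c][block] = "S"
                state[cpu][block] = "S"
        elif op == "W":
            if mine == "M":
                continue  # write hit on an exclusive copy: no bus traffic
            if mine == "S":
                bus["invalidate"] += 1  # upgrade: invalidate other copies
            else:
                bus["write_miss"] += 1  # fetch the block with ownership
                for c in others:
                    if state[c][block] == "M":
                        bus["write_back"] += 1
            for c in others:
                del state[c][block]
            state[cpu][block] = "M"
        else:
            raise ValueError(f"unknown operation: {op!r}")

    return dict(bus)


# Hypothetical trace: two processors sharing one block (addresses 0x100-0x11F).
example_trace = [
    (0, "R", 0x100),  # CPU 0 read miss
    (1, "R", 0x104),  # CPU 1 read miss, block now shared
    (0, "W", 0x108),  # CPU 0 writes a shared block: invalidation on the bus
    (1, "R", 0x100),  # CPU 1 misses again and forces a write-back
    (0, "W", 0x100),  # second invalidation
    (0, "W", 0x104),  # hit on a modified copy, no bus action
]
counts = simulate(example_trace, num_cpus=2)
print(counts)
```

In this sketch only the third, fourth, and fifth records generate
coherence-related bus actions; a real evaluation would add finite caches,
bus timing, and operating-system references, which the traces are meant
to capture.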