Multithreaded architectures have been proposed for future
multiprocessor systems. However, some open issues remain. Can
multithreading be supported in a multiprocessor so that it can
tolerate synchronization and communication latencies, with little
intrusion on the performance of
sequentially-executed code? How much does such support contribute to
scalable performance when communication and synchronization demands
are high? In this paper, we describe the design of EARTH, an
architecture which addresses these issues. Each processor in EARTH has
an off-the-shelf Execution Unit (EU) for executing threads, and an
ASIC Synchronization Unit (SU) supporting dataflow-like thread
synchronizations, scheduling, and remote requests. In preparation for
an implementation of
the SU, we have emulated a basic EARTH model on MANNA 2.0, an
existing multiprocessor whose hardware configuration closely matches
EARTH. This EARTH-MANNA testbed is fully functional, enabling us to
experiment
with large benchmarks at impressive speed. With this platform, we
demonstrate that multithreading support can be efficiently implemented
(with little emulation overhead) in a multiprocessor without a major
impact on uniprocessor performance. Also, we measure how much basic
multithreading support can help in tolerating increasing
communication/synchronization demands.
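
The division of labor between the EU and the SU can be illustrated with a small software sketch. The code below is only an illustration under stated assumptions, not the EARTH implementation: the names SyncSlot, Node, init_slot, sync, and data_sync are hypothetical, and the counting-slot scheme is one common way to realize dataflow-like synchronization. A thread becomes ready only after a fixed number of synchronization signals have arrived; the SU role is played by the methods that count signals and enqueue threads, and the EU role by the loop that runs ready threads to completion.

```python
from collections import deque


class SyncSlot:
    """Hypothetical sync slot: a counter plus the thread it enables."""

    def __init__(self, count, thread):
        self.count = count    # signals still missing
        self.reset = count    # value restored after the thread is enabled
        self.thread = thread  # callable run by the EU once enabled


class Node:
    """One processing node: SU-like bookkeeping plus an EU-like run loop."""

    def __init__(self):
        self.ready = deque()  # ready queue filled by the SU side
        self.slots = {}       # slot id -> SyncSlot
        self.frame = {}       # local data shared by the node's threads

    def init_slot(self, slot_id, count, thread):
        self.slots[slot_id] = SyncSlot(count, thread)

    def sync(self, slot_id):
        # SU role: count one signal; enable the thread when none are missing.
        slot = self.slots[slot_id]
        slot.count -= 1
        if slot.count == 0:
            slot.count = slot.reset
            self.ready.append(slot.thread)

    def data_sync(self, name, value, slot_id):
        # Models the arrival of a remote value followed by a sync signal.
        self.frame[name] = value
        self.sync(slot_id)

    def run(self):
        # EU role: run each ready thread to completion, without preemption.
        executed = 0
        while self.ready:
            thread = self.ready.popleft()
            thread(self)
            executed += 1
        return executed


def add_operands(node):
    # Thread body: runs only after both operands have arrived.
    node.frame["sum"] = node.frame["a"] + node.frame["b"]


node = Node()
node.init_slot(0, count=2, thread=add_operands)
node.data_sync("a", 3, slot_id=0)
ran_early = node.run()  # one operand is still missing, so nothing runs
node.data_sync("b", 4, slot_id=0)
ran_late = node.run()   # both operands are present, so the thread runs
```

In this sketch the EU never blocks waiting for data: a thread whose operands are outstanding is simply absent from the ready queue, which is the sense in which such a scheme can tolerate communication and synchronization latency.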