Microprocessor-based multiprocessors offer true parallelism at moderate
hardware cost. Although such hardware building blocks are now available
at many sites, the basic problem remains how to program such systems.
We report on an integrated programming environment for the M3
multiprocessor, which has been built at ETH Zurich. Our tools support
the software development cycle of a parallel program, that is, the
programming, configuration, and debugging/performance measurement phases.
Programmer support for performance analysis has been a primary
motivation for the system. We identify the sources of performance loss and
describe how this information is gathered and analyzed. As a case study,
we use a fast maze router algorithm and follow the usage of the
different tools. Finally, we compare the M3 environment with other
state-of-the-art projects.