Scheduling parallel programs optimally on multiple processors is
difficult, partly because of interactions between applications, system
software, and hardware having unexpected effects on performance. These
interactions are hard to quantify and difficult to model. A convenient
and effective means of quickly examining the behavior of such systems
can make the evaluation and refinement of scheduling paradigms easier.
The authors have used a program-visualization tool called PV to help
them develop and tune runtime systems for scheduling nested parallel
loops on shared- and distributed-memory machines. In a series of
experiments, PV gave feedback concerning the effectiveness of
alternative algorithms and parameters. A prevalent and striking
revelation of the visualizations was that, because of systemic effects,
parallel-loop iterations exhibit execution-time variance, even when
there is no algorithmic variance. This suggests that dynamic scheduling
might be necessary to effectively use processors.
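
The closing point about execution-time variance can be illustrated with a toy simulation. The sketch below is not the authors' runtime system and has nothing to do with PV itself; the function names and the iteration times are hypothetical, chosen only to show why a fixed up-front partition can lose to a scheme in which idle processors pick up the next iteration. It assumes zero scheduling overhead, which a real dynamic scheduler would not have.

```python
import heapq

def static_makespan(times, procs):
    """Block-partition iterations up front; the slowest processor sets the finish time."""
    chunk = -(-len(times) // procs)  # ceiling division
    return max(sum(times[i:i + chunk]) for i in range(0, len(times), chunk))

def dynamic_makespan(times, procs):
    """Self-scheduling: whichever processor becomes free first takes the next iteration."""
    free_at = [0] * procs            # time at which each processor becomes idle
    heapq.heapify(free_at)
    for t in times:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + t)
    return max(free_at)

# Eight iterations with identical nominal cost; the last two are slowed
# by a (hypothetical) systemic effect such as contention.
perturbed = [1, 1, 1, 1, 1, 1, 3, 3]
print(static_makespan(perturbed, 2), dynamic_makespan(perturbed, 2))
```

With uniform iteration times the two policies finish together in this model; they diverge only once some iterations run longer than others, which is the situation the visualizations revealed.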