The I/O bottleneck in parallel computer systems has recently begun rec
eiving increasing interest. Most attention has focused on improving th
e performance of I/O devices using fairly low-level parallelism in tec
hniques such as disk striping and interleaving. Widely applicable solu
tions, however, will require an integrated approach which addresses th
e problem at multiple system levels, including applications, systems s
oftware, and architecture. We propose that within the context of such
an integrated approach, scheduling parallel 110 operations will become
increasingly attractive and can potentially provide substantial perfo
rmance benefits. We describe a simple I/O scheduling problem and prese
nt approximate algorithms for its solution. The costs of using these a
lgorithms in terms of execution time, and the benefits in terms of red
uced time to complete a batch of I/O operations, are compared with the
situations in which no scheduling is used, and in which an optimal sc
heduling algorithm is used. The comparison is performed both theoretic
ally and experimentally. We have found that, in exchange for a small e
xecution time overhead, the approximate scheduling algorithms can prov
ide substantial improvements in 110 completion times.