Current operating systems offer poor performance when a numeric
application's working set does not fit in main memory. As a result,
programmers who wish to solve "out-of-core" problems efficiently are
typically faced with the
onerous task of rewriting an application to use explicit I/O operations
(e.g., read/write). In this paper, we propose and evaluate a fully automatic
technique that liberates the programmer from this task, provides high
performance, and requires only minimal changes to current operating systems.
In our scheme, the compiler provides the crucial information on future access
patterns without burdening the programmer; the operating system supports
nonbinding prefetch and release hints for managing I/O; and the operating
system cooperates with a run-time layer to accelerate performance by adapting
to dynamic behavior and minimizing prefetch overhead. This approach
maintains the abstraction of unlimited virtual memory for the programmer,
gives the compiler the flexibility to aggressively insert prefetches ahead of
references, and gives the operating system the flexibility to arbitrate between
the competing resource demands of multiple applications. We implemented our
compiler analysis within the SUIF compiler and used it to target
implementations of our run-time and OS support on both research and
commercial systems (Hurricane and IRIX 6.5, respectively). Our experimental
results show large performance gains for out-of-core scientific applications
on both systems: more than 50% of the I/O stall time is eliminated in most
cases, which translates into overall speedups of roughly twofold in many cases.
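
To make the idea of nonbinding prefetch and release hints concrete, the
following is a minimal sketch of what a compiler-transformed loop might look
like. It is an illustration under stated assumptions, not the interface
implemented on Hurricane or IRIX: it uses the standard madvise advice values
MADV_WILLNEED and MADV_DONTNEED (exposed by Python's mmap module on platforms
that support them) as stand-ins for prefetch and release. The names hint,
sum_with_hints, and the prefetch distance of 4 pages are hypothetical choices
made for this example.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def hint(mm, advice_name, page, npages=1):
    """Issue a nonbinding hint for `npages` pages starting at `page`.

    Returns True if a hint was passed to the OS. Because the hints are
    nonbinding, it is always safe to skip them: the loop below computes
    the same result whether or not any hint is issued.
    """
    advice = getattr(mmap, advice_name, None)  # constant may be missing
    if advice is None or not hasattr(mm, "madvise"):
        return False
    start = page * PAGE
    if start >= len(mm):
        return False  # hint falls past the end of the mapping
    length = min(npages * PAGE, len(mm) - start)
    try:
        mm.madvise(advice, start, length)
    except (OSError, ValueError):
        return False  # the OS is free to refuse a hint
    return True


def sum_with_hints(path, distance=4):
    """Sum every byte of a file through a read-only memory mapping.

    Prefetch hints run `distance` pages ahead of the current reference,
    and a release hint follows each page once it has been consumed.
    Returns (total, number_of_hints_issued).
    """
    total = 0
    issued = 0
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        npages = (len(mm) + PAGE - 1) // PAGE
        # Prologue: request the first `distance` pages before the loop.
        issued += hint(mm, "MADV_WILLNEED", 0, distance)
        for p in range(npages):
            # Steady state: prefetch ahead, use the page, then release it.
            issued += hint(mm, "MADV_WILLNEED", p + distance)
            total += sum(mm[p * PAGE:(p + 1) * PAGE])
            issued += hint(mm, "MADV_DONTNEED", p)
    return total, issued


if __name__ == "__main__":
    # Small demonstration on a temporary file of 64 KiB.
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, bytes(range(256)) * 256)
    os.close(fd)
    print(sum_with_hints(demo_path))
    os.remove(demo_path)
```

The program still reads through an ordinary memory mapping, so the
virtual-memory abstraction is preserved, and the hints affect only timing,
never correctness. In the scheme described above, a run-time layer would
additionally filter out hints for pages already in memory. This sketch omits
that step.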