Compiler-based I/O prefetching for out-of-core applications

Citation
A. D. Brown et al., Compiler-based I/O prefetching for out-of-core applications, ACM Transactions on Computer Systems, 19(2), 2001, pp. 111-170
Number of citations
43
Subject categories
Computer Science & Engineering
Journal title
ACM TRANSACTIONS ON COMPUTER SYSTEMS
Journal ISSN
0734-2071
Volume
19
Issue
2
Year of publication
2001
Pages
111 - 170
Database
ISI
SICI code
0734-2071(200105)19:2<111:CIPFOA>2.0.ZU;2-V
Abstract
Current operating systems offer poor performance when a numeric application's working set does not fit in main memory. As a result, programmers who wish to solve "out-of-core" problems efficiently are typically faced with the onerous task of rewriting an application to use explicit I/O operations (e.g., read/write). In this paper, we propose and evaluate a fully automatic technique which liberates the programmer from this task, provides high performance, and requires only minimal changes to current operating systems. In our scheme the compiler provides the crucial information on future access patterns without burdening the programmer; the operating system supports nonbinding prefetch and release hints for managing I/O; and the operating system cooperates with a run-time layer to accelerate performance by adapting to dynamic behavior and minimizing prefetch overhead. This approach maintains the abstraction of unlimited virtual memory for the programmer, gives the compiler the flexibility to aggressively insert prefetches ahead of references, and gives the operating system the flexibility to arbitrate between the competing resource demands of multiple applications. We implemented our compiler analysis within the SUIF compiler, and used it to target implementations of our run-time and OS support on both research and commercial systems (Hurricane and IRIX 6.5, respectively). Our experimental results show large performance gains for out-of-core scientific applications on both systems: more than 50% of the I/O stall time has been eliminated in most cases, thus translating into overall speedups of roughly twofold in many cases.
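
The general idea of nonbinding prefetch and release hints can be illustrated with a small sketch. This is not the interface described in the paper: it assumes a POSIX-like system and uses the standard madvise() advice values MADV_WILLNEED and MADV_DONTNEED (through Python's mmap module) as a stand-in for the paper's hints. The function name sum_with_hints, the distance parameter, and the hint log are hypothetical, and the loop is written by hand here, whereas in the paper's scheme the compiler would insert the hints automatically.

```python
import mmap

PAGE = mmap.PAGESIZE


def sum_with_hints(path, distance=4):
    """Sum every byte of a non-empty file, one page at a time.

    Before touching page i, hint that page i + distance will be needed soon;
    after finishing page i, hint that it is no longer needed. Both hints are
    advisory: the OS is free to ignore them. Returns (total, hint_log).
    """
    total = 0
    hints = []  # (kind, page_index) log, kept only for inspection
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        size = len(m)
        n_pages = (size + PAGE - 1) // PAGE
        # madvise() and these constants are platform-dependent; without them
        # the loop still runs and only records what it would have hinted.
        can_hint = (hasattr(m, "madvise")
                    and hasattr(mmap, "MADV_WILLNEED")
                    and hasattr(mmap, "MADV_DONTNEED"))

        def hint(kind, advice_name, page):
            start = page * PAGE
            if can_hint:
                m.madvise(getattr(mmap, advice_name), start,
                          min(PAGE, size - start))
            hints.append((kind, page))

        # Prologue: request the first `distance` pages before the loop starts.
        for page in range(min(distance, n_pages)):
            hint("prefetch", "MADV_WILLNEED", page)

        # Steady state: prefetch ahead, compute, release behind.
        for i in range(n_pages):
            if i + distance < n_pages:
                hint("prefetch", "MADV_WILLNEED", i + distance)
            total += sum(m[i * PAGE:(i + 1) * PAGE])
            hint("release", "MADV_DONTNEED", i)
    return total, hints
```

Because the mapping is read-only and the hints are advisory, the computed sum is the same whether or not the OS acts on them; only the timing of the I/O can change. Choosing distance is the part that depends on I/O latency and per-iteration work, which is the kind of decision the abstract assigns to the compiler and the run-time layer.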