New high-speed networks greatly encourage the use of network memory as a cache for virtual memory and file pages, thereby reducing the need for disk access. Because pages are the fundamental transfer and access units in remote memory systems, page size is a key performance factor. Recently, page sizes of modern processors have been increasing in order to provide more TLB coverage and amortize disk access costs. Unfortunately, for high-speed networks, small transfers are needed to provide low latency. This trend in page size is thus at odds with the use of network memory on high-speed networks. This paper studies the use of subpages as a means of reducing transfer size and latency in a remote-memory environment. Using trace-driven simulation, we show how and why subpages reduce latency and improve the performance of programs using network memory. Our results show that memory-intensive applications execute up to 1.8 times faster with 1K-byte subpages than with full 8K-byte pages in the global memory system. Those same applications using 1K-byte subpages execute up to 4 times faster than they would using the disk for backing store. Using a prototype implementation on the DEC Alpha and AN2 network, we demonstrate how subpages can reduce remote-memory fault time; e.g., our prototype is able to satisfy a fault on a 1K subpage stored in remote memory in 0.5 milliseconds, one third the time of a full page.