An important aspect of a high-speed network system is the ability to
transfer data directly between the network interface and application
buffers. Such a direct data path requires the network interface to
"know" the virtual-to-physical address translation of a user buffer,
i.e., the physical memory location of the buffer. This paper presents
an efficient address translation architecture, User-managed TLB (UTLB),
which eliminates system calls and device interrupts from the common
communication path. UTLB also supports application-specific policies to
pin and unpin application memory. We report micro-benchmark results for
an implementation on Myrinet PC clusters. A trace-driven analysis is
used to compare the UTLB approach with the interrupt-based approach. It
is also used to study the effects of UTLB cache size, associativity,
and prefetching. Our results show that the UTLB approach delivers
robust performance with relatively small translation cache sizes.
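
To make the notions of translation cache size and associativity concrete, the following is a minimal sketch of a generic set-associative translation cache with LRU replacement. It is an illustration only and not the UTLB design itself: the class name `TranslationCache`, the `host_table` dictionary standing in for a full translation table, and the 4 KB page size are all assumptions introduced for this example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes (not taken from the paper)


class TranslationCache:
    """Hypothetical set-associative cache of virtual-page -> physical-page
    entries with per-set LRU replacement."""

    def __init__(self, num_sets, ways, host_table):
        self.num_sets = num_sets      # number of sets (cache size = num_sets * ways)
        self.ways = ways              # associativity: entries per set
        self.host_table = host_table  # assumed full table: virtual page -> physical page
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for a virtual address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entries = self.sets[vpn % self.num_sets]
        if vpn in entries:
            self.hits += 1
            entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            # Miss: fetch the entry from the full table (raises KeyError
            # if the page has no registered translation).
            entries[vpn] = self.host_table[vpn]
            if len(entries) > self.ways:
                entries.popitem(last=False)  # evict the least recently used entry
        return entries[vpn] * PAGE_SIZE + offset
```

Replaying a sequence of buffer addresses through `translate` and comparing `hits` against `misses` for different values of `num_sets` and `ways` is, roughly, the kind of experiment a trace-driven study of cache size and associativity would perform.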