Various projects and products have been built using off-the-shelf
field-programmable gate arrays (FPGAs) as computation accelerators for
specific tasks. Such systems typically connect one or more FPGAs to the
host computer via an I/O bus. Some have shown remarkable speedups, albeit
limited to specific application domains.
Many factors limit the general usefulness of such systems. Long
reconfiguration times prevent acceleration of applications that spread
their time over many different tasks. Low-bandwidth paths for data
transfer limit the usefulness of such systems to tasks that have a high
compute-to-memory-bandwidth ratio. In addition, standard FPGA tools
require hardware design expertise beyond the knowledge of most
programmers.
To help investigate the viability of connected FPGA systems, the authors
designed their own architecture called Garp and experimented with running
applications on it. They are also investigating whether Garp's design
enables automatic, fast, effective compilation across a broad range of
applications. They present their results in this article.