This article presents a customizable architecture for software agents
that capture and access information in large, heterogeneous,
distributed electronic repositories. The key idea is to exploit
underlying structure at various levels of granularity to build
high-level indices with task-specific interpretations. Information
agents construct such indices and are configured as a network of
reusable modules called structure detectors and segmenters. We
illustrate our architecture with the design and implementation of
smart information filters in two contexts: retrieving stock market
data from Internet newsgroups and retrieving technical reports from
Internet FTP sites.
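
As a rough illustration of how segmenters and structure detectors
might be composed into an index-building agent, the Python sketch
below may help. It is not the article's implementation: the function
names (paragraph_segmenter, table_detector, build_index), the
"SYMBOL number" heuristic, and the sample newsgroup post are all
assumptions made for this example.

```python
import re
from typing import Callable, Dict, List

# Hypothetical module types; the names are illustrative, not from the article.
Segmenter = Callable[[str], List[str]]
Detector = Callable[[str], bool]


def paragraph_segmenter(text: str) -> List[str]:
    """Split text into segments separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def table_detector(segment: str) -> bool:
    """Guess that a segment is tabular if most lines look like 'SYMBOL number'."""
    lines = segment.splitlines()
    rows = [
        ln for ln in lines
        if re.fullmatch(r"\s*[A-Z]{1,5}\s+\d+(\.\d+)?\s*", ln)
    ]
    return len(lines) > 0 and len(rows) * 2 > len(lines)


def build_index(
    text: str, segmenter: Segmenter, detectors: Dict[str, Detector]
) -> Dict[str, List[int]]:
    """Run every detector over every segment; map each label to segment positions."""
    segments = segmenter(text)
    index: Dict[str, List[int]] = {label: [] for label in detectors}
    for i, seg in enumerate(segments):
        for label, detect in detectors.items():
            if detect(seg):
                index[label].append(i)
    return index


# Made-up newsgroup post used only to exercise the sketch.
post = (
    "Here are today's closing prices.\n\n"
    "IBM 101.5\nAAPL 42\nMSFT 88.25\n\n"
    "Regards, J."
)
index = build_index(post, paragraph_segmenter, {"table": table_detector})
print(index)
```

In this sketch the segmenter and the detectors are plain functions, so
a different task (for example, spotting title pages in technical
reports) could reuse build_index by swapping in other modules.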