The dramatic growth of the Internet has created a new problem for users:
location of the relevant sources of documents. This article presents a
framework for (and experimentally analyzes a solution to) this problem,
which we call the text-source discovery problem. Our approach consists of
two phases. First, each text source exports its contents to a centralized
service. Second, users present queries to the service, which returns an
ordered list of promising text sources. This article describes GlOSS,
Glossary of Servers Server, with two versions: bGlOSS, which provides a
Boolean query retrieval model, and vGlOSS, which provides a vector-space
retrieval model. We also present hGlOSS, which provides a decentralized
version of the system. We extensively describe the methodology for
measuring the retrieval effectiveness of these systems and provide
experimental evidence, based on actual data, that all three systems are
highly effective in determining promising text sources for a given query.