The problem of using a broker to select a subset of available information
servers in order to achieve a good trade-off between document retrieval
effectiveness and cost is addressed. Server selection methods which are capable
of operating in the absence of global information, and where servers have no
knowledge of brokers, are investigated. A novel method using Lightweight
Probe queries (LWP method) is compared with several methods based on data
from past query processing, while Random and Optimal server rankings serve as
controls. Methods are evaluated, using TREC data and relevance judgments,
by computing ratios, both empirical and ideal, of recall and early
precision for the subset versus the complete set of available servers. Estimates
are also made of the best-possible performance of each of the methods. The LWP
and Topic Similarity methods achieved the best results, each being capable of
retrieving about 60% of the relevant documents for only one-third of the cost
of querying all servers. Subject to the applicable cost model, the LWP
method is likely to be preferred because it is suited to dynamic environments.
The good results obtained with a simple automatic LWP implementation were
replicated using different data and a larger set of query topics.
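
The recall ratio for a server subset versus the complete set can be illustrated with a minimal Python sketch. It shows one plausible reading of the "ideal" ratio, under the assumption that every selected server returns all the relevant documents it holds and that each server costs the same to query. The function names, the toy per-server counts and the example ranking are hypothetical and are not taken from the experiments reported here.

```python
from typing import Dict, List, Sequence


def ideal_recall_ratio(relevant_per_server: Dict[str, int],
                       ranking: Sequence[str],
                       k: int) -> float:
    """Fraction of all relevant documents held by the top-k ranked servers.

    Assumption: a selected server returns every relevant document it holds.
    """
    total = sum(relevant_per_server.values())
    if total == 0:
        return 0.0
    selected = ranking[:k]
    return sum(relevant_per_server.get(s, 0) for s in selected) / total


def optimal_ranking(relevant_per_server: Dict[str, int]) -> List[str]:
    """Control ranking: servers ordered by how many relevant documents they hold."""
    return sorted(relevant_per_server, key=relevant_per_server.get, reverse=True)


# Hypothetical toy data: relevant documents per server for one query topic.
relevant = {"s1": 6, "s2": 1, "s3": 0, "s4": 2, "s5": 1, "s6": 0}

# Hypothetical ranking produced by some selection method.
method_ranking = ["s1", "s3", "s4", "s2", "s5", "s6"]

k = 2  # query only the top two of six servers
method_ratio = ideal_recall_ratio(relevant, method_ranking, k)
optimal_ratio = ideal_recall_ratio(relevant, optimal_ranking(relevant), k)

# Assumption: uniform cost per server, so cost is the fraction of servers queried.
cost_fraction = k / len(relevant)

print(f"method: {method_ratio:.2f}, optimal: {optimal_ratio:.2f}, "
      f"cost: {cost_fraction:.2f}")
```

Comparing the method's ratio against the Optimal control for the same value of k indicates how much of the attainable recall a ranking captures at a given cost. A different cost model would change only the `cost_fraction` line.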