With the profusion of text databases on the Internet, it is becoming
increasingly hard to find the most useful databases for a given query.
To attack this problem, several existing and proposed systems employ
brokers to direct user queries, using a local database of summary
information about the available databases. This summary information
must effectively distinguish relevant databases and must be compact
while allowing efficient access. We offer evidence that one broker,
GlOSS, can be effective at locating databases of interest even in a
system of hundreds of databases, and we examine the performance of
accessing the GlOSS summaries for two promising storage methods: the
grid file and partitioned hashing. We show that both methods can be
tuned to provide good performance for a particular workload (within a
broad range of workloads), and we discuss the tradeoffs between the
two data structures.
As a side effect of our work, we show that grid files are more broadly
applicable than previously thought; in particular, we show that by
varying the policies used to construct the grid file we can provide
good performance for a wide range of workloads even when storing
highly skewed data.