The scatter of documents over databases in different subject domains: How many databases are needed?

Citation
W.W. Hood and C.S. Wilson, The scatter of documents over databases in different subject domains: How many databases are needed?, J AM SOC IN, 52(14), 2001, pp. 1242-1254
Number of citations
22
Subject categories
Library & Information Science
Journal title
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY
ISSN journal
1532-2882
Volume
52
Issue
14
Year of publication
2001
Pages
1242 - 1254
Database
ISI
SICI code
1532-2882(200112)52:14<1242:TSODOD>2.0.ZU;2-V
Abstract
The distribution of bibliographic records in on-line bibliographic databases is examined using 14 different search topics. These topics were searched using the DIALOG database host, and using as many suitable databases as possible. The presence of duplicate records in the searches was taken into consideration in the analysis, and the problem with lexical ambiguity in at least one search topic is discussed. The study answers questions such as how many databases are needed in a multifile search for particular topics, and what coverage will be achieved using a certain number of databases. The distribution of the percentages of records retrieved over a number of databases for 13 of the 14 search topics roughly fell into three groups: (1) high concentration of records in one database with about 80% coverage in five to eight databases; (2) moderate concentration in one database with about 80% coverage in seven to 10 databases; and (3) low concentration in one database with about 80% coverage in 16 to 19 databases. The study does conform with earlier results, but shows that the number of databases needed for searches with varying complexities of search strategies is much more topic dependent than previous studies would indicate.
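
The coverage question raised in the abstract (how many databases are needed to reach about 80% of the unique records for a topic) can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes duplicates have already been reduced to shared record identifiers, it orders databases greedily by the number of new records each one adds, and the database names and record IDs are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def cumulative_coverage(records_by_db: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Order databases greedily by the number of not-yet-seen records they add.

    Returns (database name, cumulative fraction of unique records) pairs.
    Assumes duplicate records share the same identifier across databases.
    """
    total = len(set().union(*records_by_db.values()))
    if total == 0:
        return []
    seen: Set[str] = set()
    remaining = dict(records_by_db)
    ranking: List[Tuple[str, float]] = []
    while remaining:
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - seen))
        seen |= remaining.pop(best)
        ranking.append((best, len(seen) / total))
    return ranking


def databases_needed(ranking: List[Tuple[str, float]], target: float = 0.8) -> int:
    """Return how many databases, taken in ranked order, reach the target coverage."""
    for count, (_, coverage) in enumerate(ranking, start=1):
        if coverage >= target:
            return count
    return len(ranking)


# Hypothetical search results for one topic: database name -> record IDs.
hits = {
    "DB_A": {"r1", "r2", "r3", "r4", "r5", "r6"},
    "DB_B": {"r5", "r6", "r7", "r8"},
    "DB_C": {"r8", "r9"},
    "DB_D": {"r10"},
}

ranking = cumulative_coverage(hits)
for name, coverage in ranking:
    print(f"{name}: {coverage:.0%} cumulative coverage")
print("Databases needed for 80% coverage:", databases_needed(ranking))
```

In this toy data the first database holds 60% of the unique records, which resembles the "high concentration" group only in shape; the figures themselves are invented for illustration.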