Text mining using database tomography and bibliometrics: A review

Citation
R.N. Kostoff et al., Text mining using database tomography and bibliometrics: A review, TECHNOL FOR, 68(3), 2001, pp. 223-253
Number of citations
27
Subject categories
Environmental Studies, Geography & Development
Journal title
TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE
Journal ISSN
0040-1625
Volume
68
Issue
3
Year of publication
2001
Pages
223 - 253
Database
ISI
SICI code
0040-1625(200111)68:3<223:TMUDTA>2.0.ZU;2-S
Abstract
Database tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multiword phrase frequencies and phrase proximities (physical closeness of the multiword technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT has been used to derive technical intelligence from a variety of textual database sources, most recently the published technical literature as exemplified by the Science Citation Index (SCI) and the Engineering Compendex (EC). Phrase frequency analysis (the occurrence frequency of multiword technical phrases) provides the pervasive technical themes of the topical databases of interest, and phrase proximity analysis provides the relationships among the pervasive technical themes. In the structured published literature databases, bibliometric analysis of the database records supplements the DT results by identifying the recent most prolific topical area authors; the journals that contain numerous topical area papers; the institutions that produce numerous topical area papers; the keywords specified most frequently by the topical area authors; the authors whose works are cited most frequently in the topical area papers; and the particular papers and journals cited most frequently in the topical area papers. This review paper summarizes: (1) the theory and background development of DT; (2) past published and unpublished literature study results; (3) present application activities; (4) potential expansion to new DT applications. In addition, application of DT to technology forecasting is addressed. (C) 2001 Elsevier Science Inc. All rights reserved.
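
Illustrative sketch (not part of the original record)
The abstract describes phrase frequency analysis and phrase proximity analysis only in general terms. The Python sketch below shows one simple way such counts could be computed. It is not the DT algorithm itself: the function names, the regex tokenizer, the fixed phrase length n, and the word window are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_frequencies(tokens, n=2):
    """Count every run of n adjacent words (a simple stand-in for multiword phrases)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def phrase_proximities(tokens, theme, n=2, window=5):
    """Count n-word phrases that start within `window` words of each occurrence of `theme`.

    Phrases that overlap the theme occurrence itself are skipped, so the result
    lists neighbouring phrases only. `window` is a hypothetical parameter.
    """
    theme_tokens = theme.lower().split()
    m = len(theme_tokens)
    nearby = Counter()
    for i in range(len(tokens) - m + 1):
        if tokens[i:i + m] != theme_tokens:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens) - n, i + m - 1 + window)
        for j in range(lo, hi + 1):
            # Skip candidate phrases that share any word position with the theme.
            if j < i + m and j + n > i:
                continue
            nearby[" ".join(tokens[j:j + n])] += 1
    return nearby
```

In this sketch, the highest counts from phrase_frequencies would play the role of the pervasive themes, and calling phrase_proximities with one of those themes would list the phrases that tend to appear near it. A real analysis would still rely on the expert analyst to filter and interpret the raw counts, as the abstract emphasizes.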