Matrices, vector spaces, and information retrieval

Authors

Berry, MW; Drmac, Z; Jessup, ER

Citation

M.W. Berry et al., Matrices, vector spaces, and information retrieval, SIAM Review, 41(2), 1999, pp. 335-362

Subject categories

Mathematics

Journal title

SIAM REVIEW

ISSN journal

0036-1445

Volume

41

Issue

2

Year of publication

1999

Pages

335 - 362

Database

ISI

SICI code

0036-1445(199906)41:2<335:MVSAIR>2.0.ZU;2-4

Abstract

The evolution of digital libraries and the Internet has dramatically transformed the processing, storage, and retrieval of information. Efforts to digitize text, images, video, and audio now consume a substantial portion of both academic and industrial activity. Even when there is no shortage of textual materials on a particular topic, procedures for indexing or extracting the knowledge or conceptual information contained in them can be lacking. Recently developed information retrieval technologies are based on the concept of a vector space. Data are modeled as a matrix, and a user's query of the database is represented as a vector. Relevant documents in the database are then identified via simple vector operations. Orthogonal factorizations of the matrix provide mechanisms for handling uncertainty in the database itself. The purpose of this paper is to show how such fundamental mathematical concepts from linear algebra can be used to manage and index large text collections.
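
Illustrative sketch

The vector space idea summarized in the abstract can be illustrated with a small example. This is a minimal sketch, not the authors' implementation: the term list, the three toy documents, and the helper names term_vector, cosine, and rank are all hypothetical, and raw term counts are assumed as matrix entries. Each document becomes a column of a term-by-document matrix, the query becomes a vector over the same terms, and documents are ranked by the cosine of the angle between the query and each column. The orthogonal factorizations mentioned in the abstract are not shown here.

```python
import math

# Hypothetical index terms and toy documents, invented for illustration only.
terms = ["matrix", "vector", "retrieval", "library"]
docs = {
    "d1": "matrix vector matrix",
    "d2": "retrieval library retrieval",
    "d3": "vector retrieval",
}

def term_vector(text):
    """Count how often each indexed term occurs in the text."""
    words = text.lower().split()
    return [float(words.count(t)) for t in terms]

def cosine(u, v):
    """Cosine of the angle between u and v (0.0 if either is the zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Columns of the term-by-document matrix, one column per document.
columns = {name: term_vector(text) for name, text in docs.items()}

def rank(query):
    """Return (document, score) pairs sorted by decreasing cosine similarity."""
    q = term_vector(query)
    scores = {name: cosine(q, col) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank("matrix vector")
for name, score in ranking:
    print(name, round(score, 4))
```

For the query "matrix vector", d1 shares both query terms and ranks first, d3 shares one term, and d2 shares none and scores zero. A real system would typically weight and normalize the matrix entries instead of using raw counts.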