ORTHOGRAPHY AS A FUNDAMENTAL IMPEDIMENT TO ONLINE INFORMATION-RETRIEVAL

Authors
Citation
Ta. Brooks, ORTHOGRAPHY AS A FUNDAMENTAL IMPEDIMENT TO ONLINE INFORMATION-RETRIEVAL, Journal of the American Society for Information Science, 49(8), 1998, pp. 731-741
Number of citations
46
Subject Categories
Information Science & Library Science; Computer Science, Information Systems
Journal ISSN
0002-8231
Volume
49
Issue
8
Year of publication
1998
Pages
731 - 741
Database
ISI
SICI code
0002-8231(1998)49:8<731:OAAFIT>2.0.ZU;2-Z
Abstract
Orthography is the linguistic study of written language: Elements of text such as letters, punctuation marks, and spelling. Information retrieval systems operate in the orthographic realm matching some text strings (i.e., index entries) from documents with other text strings (i.e., query terms) from patrons. During the early history of information retrieval, it has been convenient to assume the rationality and uniformity of orthography in order to concentrate effort building information retrieval systems. Fundamental orthographic problems have persisted into modern information retrieval systems, however, where white-space normalization and the arbitrary treatment of punctuation have exacerbated the orthographic impediment to information retrieval.
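
The string-matching problem the abstract describes can be illustrated with a minimal sketch. It is not taken from the paper: the functions normalize and matches are hypothetical, and the rule they apply (lowercase the text, turn every punctuation mark into a space, collapse white space) is only an assumed example of the kind of normalization a retrieval system might perform.

```python
import string
from typing import List

# Assumption: every ASCII punctuation mark is treated as a word separator.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text: str) -> List[str]:
    """Hypothetical normalizer: lowercase, punctuation to spaces, collapse white space."""
    cleaned = text.lower().translate(_PUNCT_TO_SPACE)
    # str.split() with no argument collapses runs of white space.
    return cleaned.split()


def matches(query: str, entry: str) -> bool:
    """A query term matches an index entry when their normalized tokens are equal."""
    return normalize(query) == normalize(entry)


if __name__ == "__main__":
    # Distinct strings collapse to the same index form.
    print(normalize("C++"), normalize("C"))   # ['c'] ['c']
    # Variant spellings of one word may or may not match.
    print(matches("on-line", "on line"))      # True
    print(matches("on-line", "online"))       # False
    # Punctuation inside a name splits it into fragments.
    print(normalize("AT&T"))                  # ['at', 't']
```

Under these assumed rules, "C++" and "C" become indistinguishable, while "on-line" and "online" fail to match even though a patron would consider them the same term.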