T.A. Brooks, ORTHOGRAPHY AS A FUNDAMENTAL IMPEDIMENT TO ONLINE INFORMATION-RETRIEVAL, Journal of the American Society for Information Science, 49(8), 1998, pp. 731-741
Citations number
46
Subject Categories
Information Science & Library Science; Computer Science, Information Systems
Orthography is the linguistic study of written language: elements of text such as letters, punctuation marks, and spelling. Information retrieval systems operate in the orthographic realm, matching some text strings (i.e., index entries) from documents with other text strings (i.e., query terms) from patrons. During the early history of information retrieval, it has been convenient to assume the rationality and uniformity of orthography in order to concentrate effort on building information retrieval systems. Fundamental orthographic problems have persisted into modern information retrieval systems, however, where white-space normalization and the arbitrary treatment of punctuation have exacerbated the orthographic impediment to information retrieval.
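
The string matching described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the function names `normalize` and `matches`, and the policy of turning every ASCII punctuation mark into a space, are assumptions chosen only to show how white-space normalization and one arbitrary punctuation rule change which index entries a query term matches.

```python
import re
import string

# Hypothetical policy: map every ASCII punctuation mark to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text: str, strip_punctuation: bool = True) -> str:
    """Lowercase, optionally replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    if strip_punctuation:
        text = text.translate(_PUNCT_TO_SPACE)
    # White-space normalization: any run of whitespace becomes a single space.
    return re.sub(r"\s+", " ", text).strip()


def matches(index_entry: str, query_term: str, strip_punctuation: bool = True) -> bool:
    """Exact match of two strings after applying the same normalization to both."""
    return normalize(index_entry, strip_punctuation) == normalize(query_term, strip_punctuation)


if __name__ == "__main__":
    # The hyphenated and spaced forms match only when punctuation is stripped.
    print(matches("information-retrieval", "information retrieval"))
    print(matches("information-retrieval", "information retrieval", strip_punctuation=False))
    # The same rule fails to unify spelling variants ...
    print(matches("on-line", "online"))
    # ... and conflates terms whose punctuation carries meaning.
    print(matches("C++", "C"))
```

Under these assumptions, stripping punctuation lets "information-retrieval" match "information retrieval", yet it leaves "on-line" and "online" unmatched and makes "C++" indistinguishable from "C", which is the kind of orthographic impediment the abstract refers to.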