TELLTALE - EXPERIMENTS IN A DYNAMIC HYPERTEXT ENVIRONMENT FOR DEGRADED AND MULTILINGUAL DATA

Citation
C. Pearce and C. Nicholas, TELLTALE - EXPERIMENTS IN A DYNAMIC HYPERTEXT ENVIRONMENT FOR DEGRADED AND MULTILINGUAL DATA, Journal of the American Society for Information Science, 47(4), 1996, pp. 263-275
Number of citations
27
Subject Categories
Information Science & Library Science
Journal ISSN
00028231
Volume
47
Issue
4
Year of publication
1996
Pages
263 - 275
Database
ISI
SICI code
0002-8231(1996)47:4<263:T-EIAD>2.0.ZU;2-1
Abstract
Methods and tools for finding documents relevant to a user's needs in document corpora can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora, their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR (optical character recognition) techniques. In this article, we present experimentation results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the text.
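The abstract does not say which n-gram techniques TELLTALE uses, so the following is only a minimal sketch of the general idea, not the system's implementation. It assumes documents are compared by the cosine similarity of their character n-gram counts; the function names `char_ngrams` and `cosine_similarity`, the default `n=3`, and the sample strings are all hypothetical choices made for illustration. Because a single garbled character disturbs only the few n-grams that overlap it, a lightly garbled text may still share most of its n-grams with the clean original, without any language-specific processing.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count the overlapping n-character sequences in text.

    Assumption for this sketch: text is lowercased and runs of
    whitespace are collapsed to a single space before counting.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two n-gram count vectors."""
    # Counter returns 0 for n-grams that are missing from b.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample strings: a clean phrase, an OCR-style garble of it,
# and an unrelated phrase.
clean = char_ngrams("optical character recognition")
garbled = char_ngrams("optica1 charactcr recogniti0n")
unrelated = char_ngrams("hypertext user interface")

print("clean vs garbled:  ", cosine_similarity(clean, garbled))
print("clean vs unrelated:", cosine_similarity(clean, unrelated))
```

In this sketch the garbled phrase should score noticeably closer to the clean phrase than the unrelated phrase does, which is the property that makes n-gram matching attractive for degraded text.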