C. Pearce and C. Nicholas, TELLTALE - EXPERIMENTS IN A DYNAMIC HYPERTEXT ENVIRONMENT FOR DEGRADED AND MULTILINGUAL DATA, Journal of the American Society for Information Science, 47(4), 1996, pp. 263-275
Number of citations
27
Subject Categories
Information Science & Library Science
Methods and tools for finding documents relevant to a user's needs in
document corpora can be found in the information retrieval, library
science, and hypertext communities. Typically, these systems provide
retrieval capabilities for fairly static corpora; their algorithms
depend on the language for which they were written (e.g., English);
and they do not perform well when presented with misspelled words or
text that has been degraded by OCR (optical character recognition).
In this article, we present experimental results for the TELLTALE
system. TELLTALE is a dynamic hypertext environment that provides
full-text search from a hypertext-style user interface for text
corpora that may be garbled by OCR or transmission errors, and that
may contain languages other than English. TELLTALE uses several
techniques based on n-grams (sequences of n consecutive characters of
text). With these results we show that the dynamic linkage mechanisms
in TELLTALE are tolerant of garbles in up to 30% of the characters in
the body of the text.
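
To illustrate why character n-grams can tolerate garbled text, the following is a minimal sketch, not TELLTALE's actual algorithm (the abstract does not specify it). It assumes a simple approach: each text is reduced to a profile of overlapping character n-gram counts, and two profiles are compared by cosine similarity. The function names `char_ngrams` and `cosine_similarity`, the choice of n = 3, and the sample strings are all hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams (hypothetical helper).

    The text is lowercased and runs of whitespace are collapsed first.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative strings (assumptions, not data from the article).
clean = "full-text search over text corpora degraded by optical character recognition"
garbled = "fu1l-text seach over texl corpora degradcd by optical charactcr recogniti0n"
unrelated = "the quick brown fox jumps over the lazy dog near the river bank"

profile = char_ngrams(clean)
sim_garbled = cosine_similarity(profile, char_ngrams(garbled))
sim_unrelated = cosine_similarity(profile, char_ngrams(unrelated))

print(f"clean vs garbled:   {sim_garbled:.3f}")
print(f"clean vs unrelated: {sim_unrelated:.3f}")
```

Because a single corrupted character only disturbs the few n-grams that overlap it, most of the profile survives, so the garbled copy still scores much closer to the clean text than an unrelated sentence does. The same comparison needs no dictionary or stemmer, which is one reason n-gram methods are commonly described as language-independent.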