FROM TEXT TO HYPERTEXT BY INDEXING

Authors

Salminen, A.; Tague-Sutcliffe, J.; McClellan, C.

Citation

A. Salminen et al., FROM TEXT TO HYPERTEXT BY INDEXING, ACM transactions on information systems, 13(1), 1995, pp. 69-99

Subject Categories

Information Science & Library Science; Computer Science, Information Systems

Journal title

ACM transactions on information systems

ISSN journal

1046-8188

Volume

13

Issue

1

Year of publication

1995

Pages

69 - 99

Database

ISI

SICI code

1046-8188(1995)13:1<69:FTTHBI>2.0.ZU;2-1

Abstract

A model is presented for converting a collection of documents to hypertext by means of indexing. The documents are assumed to be semistructured, i.e., their text is a hierarchy of parts, and some of the parts consist of natural language. The model is intended as a framework for specifying hypertextual reading capabilities for specific application areas and for developing new automated tools for the conversion of semistructured text to hypertext. In the model, two well-known paradigms, formal grammars and document indexing, are combined. The structure of the source text is defined by a schema that is a constrained context-free grammar. The hierarchic structure of the source may thus be modeled by a parse tree for the grammar. The effect of indexing is described by grammar transformations. The new grammar, called an indexing schema, is associated with a new parse tree where some text parts are index elements. The indexing schema may hide some parts of the original documents or the structure of some parts. For information retrieval, parts of the indexed text are considered to be nodes of a hypergraph. In the hypergraph-based information access, the navigation capabilities of hypertext systems are combined with the querying capabilities of information retrieval systems.