AN EXTENDED VECTOR-PROCESSING SCHEME FOR SEARCHING INFORMATION IN HYPERTEXT SYSTEMS

Authors

SAVOY J

Citation

J. Savoy, AN EXTENDED VECTOR-PROCESSING SCHEME FOR SEARCHING INFORMATION IN HYPERTEXT SYSTEMS, Information processing & management, 32(2), 1996, pp. 155-170

Subject categories

Information Science & Library Science; Computer Science, Information Systems

Journal title

Information processing & management

ISSN journal

0306-4573

Volume

32

Issue

2

Year of publication

1996

Pages

155 - 170

Database

ISI

SICI code

0306-4573(1996)32:2<155:AEVSFS>2.0.ZU;2-C

Abstract

When searching information in a hypertext is limited to navigation, it is not an easy task, especially when the number of nodes and/or links becomes very large. A query-based access mechanism must be therefore provided to complement the navigational tools inherent in hypertext systems. Most mechanisms currently proposed are based on conventional information retrieval models which consider documents as independent entities, and ignore hypertext links. To promote the use of other information retrieval mechanisms adapted to hypertext systems, this study attempts to respond to the following questions: (1) How can we integrate information given by hypertext links into an information retrieval scheme? (2) Are these hypertext links (and link semantics) clues to the enhancement of retrieval effectiveness? (3) If so, how can we use them? Two solutions are: (a) using a default weight function based on link type or assigning the same strength to all link types; or (b) using a specific weight for each particular link, i.e. the level of association or a similarity measure. This study proposes an extended vector-processing scheme which extracts additional information from hypertext links to enhance retrieval effectiveness. To carry out our investigations, we have built a hypertext based on two medium-size collections, the CACM and the CISI collection. The hypergraph is composed of explicit links (bibliographic references), computed links based on bibliographic information (bibliographic coupling, cocitation), or on hypertext links established according to document representatives (nearest neighbor).
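
The computed links named in the abstract can be illustrated with a small sketch. The Python below is not the paper's scheme: the toy citation graph, the function names, and the additive score update in `extend_scores` are all assumptions made for illustration. It uses the standard definitions of bibliographic coupling (two documents share references) and cocitation (two documents are cited together), and then shows the two weighting options from the abstract: (a) one default weight for every link, or (b) a specific weight per link, here taken to be the link's own strength.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy citation graph: document -> set of documents it cites.
cites = {
    "d1": {"d3", "d4"},
    "d2": {"d3", "d4", "d5"},
    "d3": {"d5"},
    "d4": set(),
    "d5": set(),
}

def bibliographic_coupling(cites):
    """Count the references shared by each pair of documents."""
    coupling = {}
    for a, b in combinations(sorted(cites), 2):
        shared = len(cites[a] & cites[b])
        if shared:
            coupling[(a, b)] = shared
    return coupling

def cocitation(cites):
    """Count the documents that cite both members of each pair."""
    counts = Counter()
    for refs in cites.values():
        for a, b in combinations(sorted(refs), 2):
            counts[(a, b)] += 1
    return dict(counts)

def extend_scores(scores, links, weight=None):
    """Add to each document a weighted share of its linked neighbours' scores.

    Assumed update rule, for illustration only.
    weight=None -> option (b): use each link's own strength.
    weight=w    -> option (a): use the same default weight w for all links.
    """
    extended = dict(scores)
    for (a, b), strength in links.items():
        w = strength if weight is None else weight
        # Neighbour contributions always come from the original scores.
        extended[a] = extended.get(a, 0.0) + w * scores.get(b, 0.0)
        extended[b] = extended.get(b, 0.0) + w * scores.get(a, 0.0)
    return extended

# Hypothetical initial query-document similarities.
scores = {"d1": 1.0, "d2": 0.0, "d3": 0.5, "d4": 0.0, "d5": 0.0}
links = bibliographic_coupling(cites)
uniform = extend_scores(scores, links, weight=0.5)   # option (a)
per_link = extend_scores(scores, links)              # option (b)
```

In this toy example, `d2` does not match the query directly but is coupled to `d1` and `d3`, so it receives a non-zero score under either option; how such contributions should actually be weighted and normalized is the question the paper investigates.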