WHAT IS THE TREE THAT WE SEE THROUGH THE WINDOW - A LINGUISTIC APPROACH TO WINDOWING AND TERM VARIATION

Authors
C. Jacquemin
Citation
C. Jacquemin, WHAT IS THE TREE THAT WE SEE THROUGH THE WINDOW - A LINGUISTIC APPROACH TO WINDOWING AND TERM VARIATION, Information Processing & Management, 32(4), 1996, pp. 445-458
Number of citations
32
Subject Categories
Information Science & Library Science; Computer Science Information Systems
Journal ISSN
0306-4573
Volume
32
Issue
4
Year of publication
1996
Pages
445 - 458
Database
ISI
SICI code
0306-4573(1996)32:4<445:WITTTW>2.0.ZU;2-N
Abstract
Windowing techniques play a key role in information retrieval. Previous works have suggested that the quality of access to information relies heavily on the characteristics of the windows. This study provides a linguistic approach to text windowing through an extraction of term variants with the help of a partial parser. The syntactic grounding of the method ensures that words observed within restricted spans are lexically related and that spurious word co-occurrences are ruled out with a good level of confidence. The system is computationally tractable on large corpora and large lists of terms. Illustrative examples of term variations from a large medical corpus are given. An experimental evaluation of the method shows that only a small proportion of co-occurring words are lexically related and motivates the call for natural language parsing techniques in text windowing. Copyright (C) 1996 Elsevier Science Ltd