Natural language processing and information retrieval

Authors
Citation
Em. Voorhees, Natural language processing and information retrieval, LECT N A I, 1714, 1999, pp. 32-48
Number of citations
28
Subject Categories
Current Book Contents
Journal ISSN
03029743
Volume
1714
Year of publication
1999
Pages
32 - 48
Database
ISI
SICI code
0302-9743(1999)1714:<32:NLPAIR>2.0.ZU;2-0
Abstract
Information retrieval addresses the problem of finding those documents whose content matches a user's request from among a large collection of documents. Currently, the most successful general purpose retrieval methods are statistical methods that treat text as little more than a bag of words. However, attempts to improve retrieval performance through more sophisticated linguistic processing have been largely unsuccessful. Indeed, unless done carefully, such processing can degrade retrieval effectiveness.

Several factors contribute to the difficulty of improving on a good statistical baseline including: the forgiving nature but broad coverage of the typical retrieval task; the lack of good weighting schemes for compound index terms; and the implicit linguistic processing inherent in the statistical methods. Natural language processing techniques may be more important for related tasks such as question answering or document summarization.