IMPROVING INFORMATION-RETRIEVAL BY COMBINING USER PROFILE AND DOCUMENT SEGMENTATION

Citation
S. Lainecruzel et al., IMPROVING INFORMATION-RETRIEVAL BY COMBINING USER PROFILE AND DOCUMENT SEGMENTATION, Information processing & management, 32(3), 1996, pp. 305-315
Number of citations
18
Subject Categories
Information Science & Library Science; Computer Science Information Systems
Journal ISSN
0306-4573
Volume
32
Issue
3
Year of publication
1996
Pages
305 - 315
Database
ISI
SICI code
0306-4573(1996)32:3<305:IIBCUP>2.0.ZU;2-T
Abstract
Due to the ever-increasing quantity of available information, which users have to scan in order to find relevant items, noise has become a major issue in the implementation and use of information retrieval systems. The aim of this study was to design an information retrieval system permitting the "personalization" of search, by taking into account user profile. A pre-orientation system was first developed to give access to a personalized subcorpus. To limit noise in information retrieval systems, the textual material offered to the user is reduced and contains only those sections (units) of the document that interest him and are significant to him (where textual material is used in the sense of document units to be processed by content analysis in order to build descriptions of the documents). In this way, the documents are structured on the basis of utility functions. The selected document units are part of the subcorpus defined by the pre-orientation system. Next, the profile of each user is characterized by determining competence in a given field and at different levels. Each user is characterized by:
- stable information, related to the person rather than to a particular search. This information provides a general description of the user and his habits;
- variable information, related to a specific search. The priority here is to describe the objective of the search (search may be either exhaustive or non-exhaustive; it may concern specialized or popular publications, etc.).
The function of the pre-orientation system is to associate a set of characteristics applying to document units to a given user profile. Search is then applied only to the subset of the selected document units that are relevant to the user and established following his profile. Document units are not characterized on the basis of thematic criteria related to content, but rather on the basis of criteria relating to utility.
The objective was to propose a hypothesis on the different parameters determining user profile and document unit characteristics, and to test such a hypothesis using an existing information retrieval system incorporating full-text natural language processing tools. (C) 1996 Elsevier Science Ltd
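
The two-stage flow described in the abstract (pre-orientation by utility, then search restricted to the resulting subcorpus) can be illustrated with a minimal Python sketch. It is not the authors' system: the class names, the `UNIT_TYPES_BY_LEVEL` rule table, the unit-type labels, and the keyword matcher are all hypothetical, chosen only to show how a profile with stable and variable parts might select document units before any search is run.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentUnit:
    doc_id: str
    unit_type: str  # hypothetical utility label, e.g. "method" or "results"
    audience: str   # hypothetical: "specialized" or "popular"
    text: str


@dataclass
class UserProfile:
    competence: dict[str, str]  # stable information: field -> level
    exhaustive: bool            # variable information: search objective
    audience: str               # variable information: publication type sought


# Hypothetical rule table: which unit types are useful at each competence level.
UNIT_TYPES_BY_LEVEL = {
    "novice": {"introduction", "conclusion"},
    "expert": {"method", "results"},
}


def pre_orient(profile: UserProfile, field: str,
               units: list[DocumentUnit]) -> list[DocumentUnit]:
    """Select the personalized subcorpus using utility criteria only."""
    level = profile.competence.get(field, "novice")  # assumed default level
    if profile.exhaustive:
        # An exhaustive search keeps every unit type for the chosen audience.
        wanted = set().union(*UNIT_TYPES_BY_LEVEL.values())
    else:
        wanted = UNIT_TYPES_BY_LEVEL[level]
    return [u for u in units
            if u.unit_type in wanted and u.audience == profile.audience]


def search(query: str, units: list[DocumentUnit]) -> list[DocumentUnit]:
    """Toy keyword match standing in for the full-text retrieval engine."""
    terms = query.lower().split()
    return [u for u in units if all(t in u.text.lower() for t in terms)]


CORPUS = [
    DocumentUnit("d1", "introduction", "specialized",
                 "Overview of noise in retrieval systems"),
    DocumentUnit("d1", "method", "specialized",
                 "Segmentation method reduces noise in retrieval"),
    DocumentUnit("d2", "results", "popular",
                 "Noise results for casual readers"),
    DocumentUnit("d2", "method", "specialized",
                 "Indexing method for full text"),
]

expert = UserProfile(competence={"retrieval": "expert"},
                     exhaustive=False, audience="specialized")
subcorpus = pre_orient(expert, "retrieval", CORPUS)
hits = search("noise", subcorpus)
```

In this sketch the thematic query ("noise") is applied only after the utility-based filter, which mirrors the abstract's point that document units are selected by utility rather than by content.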