S. Lainecruzel et al., IMPROVING INFORMATION-RETRIEVAL BY COMBINING USER PROFILE AND DOCUMENT SEGMENTATION, Information processing & management, 32(3), 1996, pp. 305-315
Number of citations
18
Subject Categories
Information Science & Library Science; Computer Science Information Systems
Due to the ever-increasing quantity of available information, which
users have to scan in order to find relevant items, noise has become a
major issue in the implementation and use of information retrieval
systems. The aim of this study was to design an information retrieval
system permitting the "personalization" of search, by taking the user
profile into account. A pre-orientation system was first developed to
give access to a personalized subcorpus. To limit noise in information
retrieval systems, the textual material offered to the user is reduced and
contains only those sections (units) of the document that interest him
and are significant to him (where textual material is used in the
sense of document units to be processed by content analysis in order to
build descriptions of the documents). In this way, the documents are
structured on the basis of utility functions. The selected document
units are part of the subcorpus defined by the pre-orientation system.
Next, the profile of each user is characterized by determining
competence in a given field and at different levels. Each user is characterized
by:
- stable information, related to the person rather than to a
  particular search. This information provides a general description of
  the user and his habits;
- variable information, related to a specific search. The priority here
  is to describe the objective of the search (search may be either
  exhaustive or non-exhaustive; it may concern specialized or popular
  publications, etc.).
The function of the pre-orientation
system is to associate a given user profile with a set of
characteristics applying to document units. Search is then applied only
to the subset of document units that are relevant to the user, selected
according to his profile. Document units are not characterized
on the basis of thematic criteria related to content, but rather on
the basis of criteria relating to utility. The objective was to propose
a hypothesis on the different parameters determining user profile and
document unit characteristics, and to test such a hypothesis using an
existing information retrieval system incorporating full-text natural
language processing tools. (C) 1996 Elsevier Science Ltd
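
The pre-orientation step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual parameters or rules, so everything named here is an assumption made for illustration: the classes `UserProfile` and `DocumentUnit`, the utility labels ("specialized", "popular", "background"), the rule in `wanted_characteristics`, and the naive term matching in `search`, which merely stands in for the full-text natural language processing tools mentioned above.

```python
from dataclasses import dataclass


@dataclass
class DocumentUnit:
    unit_id: str
    text: str
    # Hypothetical utility labels; units are described by utility,
    # not by thematic content.
    utility: frozenset


@dataclass
class UserProfile:
    # Stable information: tied to the person, not to one search.
    competence: dict  # field -> level (hypothetical scale)
    # Variable information: tied to the current search.
    exhaustive: bool
    audience: str  # hypothetical values: "specialized" or "popular"


def wanted_characteristics(profile):
    """Map a profile to document-unit characteristics (assumed rule)."""
    wanted = {profile.audience}
    if profile.exhaustive:
        # Assumption: exhaustive searches also admit background units.
        wanted.add("background")
    return wanted


def pre_orient(profile, units):
    """Return the personalized subcorpus for this profile."""
    wanted = wanted_characteristics(profile)
    return [u for u in units if u.utility & wanted]


def search(query, subcorpus):
    """Naive stand-in for full-text search: every query term must occur."""
    terms = query.lower().split()
    return [u for u in subcorpus if all(t in u.text.lower() for t in terms)]


units = [
    DocumentUnit("d1#intro", "Noise in retrieval systems",
                 frozenset({"popular"})),
    DocumentUnit("d1#methods", "Noise reduction by segmentation",
                 frozenset({"specialized"})),
    DocumentUnit("d2#history", "History of retrieval noise",
                 frozenset({"background"})),
]
profile = UserProfile({"information retrieval": 3},
                      exhaustive=False, audience="specialized")
hits = search("noise", pre_orient(profile, units))
print([u.unit_id for u in hits])
```

The point of the design is that filtering happens before the query is run: the same query returns different units for different profiles, because search only ever sees the subcorpus selected by utility.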