Enhancing concept-based retrieval based on minimal term sets

Authors

Alsaffar, AH; Deogun, JS; Raghavan, VV; Sever, H

Citation

Ah. Alsaffar et al., Enhancing concept-based retrieval based on minimal term sets, J INTELL IN, 14(2-3), 2000, pp. 155-173

Citations number

Subject categories

Information Technology & Communication Systems

Journal title

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS

ISSN journal

0925-9902

Volume

14

Issue

2-3

Year of publication

2000

Pages

155 - 173

Database

ISI

SICI code

0925-9902(200003)14:2-3<155:ECRBOM>2.0.ZU;2-1

Abstract

There is considerable interest in bridging the terminological gap that exists between the way users prefer to specify their information needs and the way queries are expressed in terms of keywords or text expressions that occur in documents. One of the approaches proposed for bridging this gap is based on technologies for expert systems. The central idea of such an approach was introduced in the context of a system called Rule Based Information Retrieval by Computer (RUBRIC). In RUBRIC, user query topics (or concepts) are captured in a rule base represented by an AND/OR tree. The evaluation of an AND/OR tree is essentially based on minimum and maximum weights of query terms for conjunctions and disjunctions, respectively. The time to generate the retrieval output of an AND/OR tree for a given query topic is exponential in the number of conjunctions in the DNF expression associated with the query topic. In this paper, we propose a new approach for computing the retrieval output. The proposed approach involves preprocessing of the rule base to generate Minimal Term Sets (MTSs) that speed up the retrieval process. The computational complexity of the on-line query evaluation following the preprocessing is polynomial in m. We show that the computation and use of MTSs allow a user to choose query topics that best suit their needs and to use retrieval functions that yield a more refined and controlled retrieval output than is possible with the AND/OR tree when document terms are binary. We incorporate the p-Norm model into the process of evaluating MTSs to handle the case where weights of both documents and query terms are non-binary.
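
The min/max evaluation of an AND/OR tree mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not cover MTS generation or the p-Norm model. The class names `Term` and `Node`, the function `evaluate`, the example terms, and the rule weights are all hypothetical, and scoring a leaf as weight times the binary presence of the term is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Term:
    """Leaf of the tree: a query term with a rule weight (hypothetical values)."""
    name: str
    weight: float = 1.0


@dataclass
class Node:
    """Internal node of the tree: op is either "AND" or "OR"."""
    op: str
    children: List[Union["Node", Term]]


def evaluate(node: Union[Node, Term], doc_terms: Dict[str, float]) -> float:
    """Score one document against a topic tree.

    doc_terms maps a term to 1.0 if it occurs in the document (binary case);
    absent terms count as 0.0. Conjunctions take the minimum of their
    children's scores, disjunctions take the maximum.
    """
    if isinstance(node, Term):
        # Assumption: leaf score = rule weight * binary term presence.
        return node.weight * doc_terms.get(node.name, 0.0)
    scores = [evaluate(child, doc_terms) for child in node.children]
    if node.op == "AND":
        return min(scores)
    if node.op == "OR":
        return max(scores)
    raise ValueError(f"unknown operator: {node.op}")


# Hypothetical topic: (alpha AND beta) OR gamma
topic = Node("OR", [
    Node("AND", [Term("alpha", 0.8), Term("beta", 0.9)]),
    Term("gamma", 0.7),
])

score_both = evaluate(topic, {"alpha": 1.0, "beta": 1.0})   # conjunction satisfied
score_gamma = evaluate(topic, {"gamma": 1.0})                # only the disjunct leaf
score_partial = evaluate(topic, {"alpha": 1.0})              # conjunction fails
```

In this sketch a document containing only one term of the conjunction scores zero unless the alternative disjunct matches, which is the all-or-nothing behaviour of min/max evaluation on binary document terms.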