AN EXTENDED FUZZY LINGUISTIC APPROACH TO GENERALIZE BOOLEAN INFORMATION-RETRIEVAL

Citation
Dh. Kraft et al., AN EXTENDED FUZZY LINGUISTIC APPROACH TO GENERALIZE BOOLEAN INFORMATION-RETRIEVAL, Information sciences, applications, 2(3), 1994, pp. 119-134
Number of citations
12
Subject Categories
Information Science & Library Science; Computer Science, Information Systems
ISSN journal
1069-0115
Volume
2
Issue
3
Year of publication
1994
Pages
119 - 134
Database
ISI
SICI code
1069-0115(1994)2:3<119:AEFLAT>2.0.ZU;2-Q
Abstract
The generalization of Boolean information retrieval systems is still of interest to scholars. In spite of the fact that commercial systems use Boolean retrieval mechanisms, such systems still have some limitations. One of the main problems is that such systems lack the ability to deal well with imprecision and subjectivity. Previous efforts have led to the introduction of numeric weights to improve both document representations (term weights) and query languages (query weights). However, the use of weights requires a clear knowledge of the semantics of the query in order to translate a fuzzy concept into a precise numeric value. Moreover, it is difficult to model the matching of queries to documents in a way that will preserve the semantics of user queries. A linguistic extension has been generated, starting from an existing Boolean weighted retrieval model and formalized within fuzzy set theory, in which numeric query weights are replaced by linguistic descriptors that specify the degree of importance of the terms. In the past, query weights were seen as measures of the importance of a specific term in representing the query or as a threshold to aid in matching a specific document to the query. The linguistic extension was originally modeled to view the query weights as a description of the ideal document, so that deviations would be rejected whether a given document had term weights that were too high or too low. This paper looks at an extension to the linguistic model that is not symmetric in that documents with a term weight below the query weight are treated differently than documents with a term weight above the query weight.
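Illustrative sketch
The abstract contrasts a symmetric "ideal document" reading of query weights with an asymmetric one. The Python sketch below is only a rough illustration of that contrast and is not taken from the paper. The descriptor labels and their numeric values, the Gaussian-shaped matching curve, the spread parameters, and the choice to be more lenient above the query weight than below it are all assumptions made for the example.

```python
import math

# Hypothetical linguistic descriptors mapped to numeric importance levels in [0, 1].
# The labels and values are assumptions, not the paper's definitions.
LINGUISTIC_WEIGHTS = {"low": 0.2, "medium": 0.5, "high": 0.8}


def symmetric_match(term_weight, query_weight, spread=0.3):
    """Ideal-document reading: deviations above and below the query
    weight are penalized equally (assumed Gaussian-shaped curve)."""
    return math.exp(-((term_weight - query_weight) / spread) ** 2)


def asymmetric_match(term_weight, query_weight,
                     spread_below=0.3, spread_above=0.9):
    """Asymmetric reading: a term weight below the query weight is
    treated differently from one above it. Here the penalty above is
    gentler, which is an assumption made for illustration."""
    spread = spread_below if term_weight < query_weight else spread_above
    return math.exp(-((term_weight - query_weight) / spread) ** 2)


if __name__ == "__main__":
    q = LINGUISTIC_WEIGHTS["medium"]
    for f in (0.3, 0.5, 0.7):
        print(f, round(symmetric_match(f, q), 3), round(asymmetric_match(f, q), 3))
```

Under these assumptions, a document whose term weight overshoots the query weight scores higher in the asymmetric variant than one that undershoots by the same amount, while the symmetric variant scores both the same.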