Beyond document similarity: Understanding value-based search and browsing technologies

Authors

Paepcke, A; Garcia-Molina, H; Rodriguez-Mula, G; Cho, J

Citation

A. Paepcke et al., Beyond document similarity: Understanding value-based search and browsing technologies, SIG RECORD, 29(1), 2000, pp. 80-92

Citations number

Subject Categories

Computer Science & Engineering

Journal title

SIGMOD RECORD

ISSN journal

0163-5808

Volume

29

Issue

1

Year of publication

2000

Pages

80 - 92

Database

ISI

SICI code

0163-5808(200003)29:1<80:BDSUVS>2.0.ZU;2-A

Abstract

In the face of small, one or two word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the 'value' of documents. Judgments are obtained directly from users, are derived by conjecture based on observations of user behavior, or are surmised from analyses of documents and collections. All these systems have been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata, and for using that metadata to improve search, ranking of results, and the enhancement of information browsing. Based on our survey and analysis, we then point to several open problems.