Beyond document similarity: Understanding value-based search and browsing technologies

Citation
A. Paepcke et al., Beyond document similarity: Understanding value-based search and browsing technologies, SIGMOD RECORD, 29(1), 2000, pp. 80-92
Number of citations
30
Subject Categories
Computer Science & Engineering
Journal title
SIGMOD RECORD
Journal ISSN
0163-5808
Volume
29
Issue
1
Year of publication
2000
Pages
80 - 92
Database
ISI
SICI code
0163-5808(200003)29:1<80:BDSUVS>2.0.ZU;2-A
Abstract
In the face of small, one or two word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the 'value' of documents. Judgments are obtained directly from users, are derived by conjecture based on observations of user behavior, or are surmised from analyses of documents and collections. All these systems have been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata, and for using that metadata to improve search, ranking of results, and the enhancement of information browsing. Based on our survey and analysis, we then point to several open problems.