Utilizing the multiple facets of WWW contents

Citation
Y. Kogan et al., Utilizing the multiple facets of WWW contents, DATA KN ENG, 28(3), 1998, pp. 255-275
Number of citations
19
Subject categories
AI Robotics and Automatic Control
Journal title
DATA & KNOWLEDGE ENGINEERING
Journal ISSN
0169-023X
Volume
28
Issue
3
Year of publication
1998
Pages
255 - 275
Database
ISI
SICI code
0169-023X(199812)28:3<255:UTMFOW>2.0.ZU;2-I
Abstract
Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, usually, the structure of the Web has little to do with the semantics of the data. Therefore, it is practically difficult to pose database queries over the Web. We introduce a new type of tags for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose on HTML pages semistructured objects in the style of the OEM model. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool for displaying both the HTML reading of Web pages and the OEM reading of Web pages. The second tool is a query language, similar to LOREL, that can query the HTML structure and/or the OEM reading. The above formalism and tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used. (C) 1998 Elsevier Science B.V. All rights reserved.
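Illustrative sketch
The abstract does not give the concrete syntax of the semantic tags, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical comment syntax, `<!--OEM label-->` ... `<!--/OEM-->`, and hypothetical names (`OEMReader`, `read_oem`, `select`). Because the tags are HTML comments, a browser renders the page unchanged, while a reader that understands them can recover a labeled, nested object and follow label paths over it, loosely in the spirit of a LOREL-style path expression.

```python
from html.parser import HTMLParser


class OEMReader(HTMLParser):
    """Builds a labeled tree from hypothetical <!--OEM label--> ... <!--/OEM--> comments."""

    def __init__(self):
        super().__init__()
        self.root = {"label": "root", "children": [], "text": ""}
        self.stack = [self.root]  # path of currently open semantic objects

    def handle_comment(self, data):
        parts = data.strip().split()
        if len(parts) == 2 and parts[0] == "OEM":
            # Opening semantic tag: start a new object under the current one.
            node = {"label": parts[1], "children": [], "text": ""}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif parts == ["/OEM"] and len(self.stack) > 1:
            # Closing semantic tag: return to the enclosing object.
            self.stack.pop()
        # Any other comment is ordinary HTML and is ignored.

    def handle_data(self, data):
        # Page text belongs to the innermost open semantic object.
        self.stack[-1]["text"] += data


def read_oem(html):
    """Return the root of the object tree superimposed on an HTML page."""
    reader = OEMReader()
    reader.feed(html)
    reader.close()
    return reader.root


def select(node, path):
    """Follow a list of labels from node and return the text of every match."""
    if not path:
        return [" ".join(node["text"].split())]
    values = []
    for child in node["children"]:
        if child["label"] == path[0]:
            values.extend(select(child, path[1:]))
    return values


# Made-up page: the list renders normally, the comments carry the semantics.
PAGE = """<ul>
<!--OEM book--><li><!--OEM title--><b>Example Title A</b><!--/OEM-->
 (<!--OEM year-->1997<!--/OEM-->)</li><!--/OEM-->
<!--OEM book--><li><!--OEM title--><b>Example Title B</b><!--/OEM--></li><!--/OEM-->
</ul>"""

titles = select(read_oem(PAGE), ["book", "title"])
years = select(read_oem(PAGE), ["book", "year"])
```

Here `titles` collects the title of every book object and `years` only the years that are present, which illustrates the semistructured character of the model: the second book simply has no year. Pages without any such comments still parse and yield an empty object tree, so HTML-level processing is unaffected.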