Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore
the structure of the Web. Usually, however, the structure of the Web has
little to do with the semantics of the data, so it is difficult in practice
to pose database queries over the Web. We introduce a new type of tag for
denoting the semantics of data stored in HTML pages. These semantic tags
(implemented as HTML comments) superimpose semistructured objects, in the
style of the OEM model, on HTML pages. The paper discusses two implemented
tools for fully utilizing the semantics. The first is a visualization tool
that displays both the HTML reading and the OEM reading of Web pages. The
second is a query language, similar to LOREL, that can query the HTML
structure, the OEM reading, or both. Together, the formalism and tools
provide data-modeling capabilities for the Web that fit its heterogeneous
nature. Real database queries, taking the OEM point of view, can be
formulated, including queries about the schema as well as queries about the
HTML structure of Web pages. Therefore, the query language is not restricted
to portions of the Web in which semantic tags are used. (C) 1998 Elsevier
Science B.V. All rights reserved.