As Extensible Markup Language (XML) is emerging as the data format of the
Internet era, there is an increasing need to efficiently store and query XML
data. One path to this goal is transforming XML data into relational format
in order to use relational database technology. Although several
transformation algorithms exist, they are incomplete in the sense that they
focus only on structural aspects and ignore semantic aspects. In this paper,
we present the semantic knowledge that needs to be captured during transformation
to ensure a correct relational schema. Further, we show an algorithm that
can (1) derive such semantic knowledge from a given XML Document Type
Definition (DTD) and (2) preserve the knowledge by representing it as semantic
constraints in relational database terms. By combining existing transformation
algorithms and our constraints-preserving algorithm, one can transform an XML
DTD to a relational schema where correct semantics and behaviors are
guaranteed by the preserved constraints. Experimental results are also
presented. (C) 2001 Published by Elsevier Science B.V.