Database design for incomplete relations

Citation
M. Levene et G. Loizou, Database design for incomplete relations, ACM T DATAB, 24(1), 1999, pp. 80-126
Citations number
47
Categorie Soggetti
Computer Science & Engineering
Journal title
ACM TRANSACTIONS ON DATABASE SYSTEMS
ISSN journal
03625915 → ACNP
Volume
24
Issue
1
Year of publication
1999
Pages
80 - 126
Database
ISI
SICI code
0362-5915(199903)24:1<80:DDFIR>2.0.ZU;2-I
Abstract
Although there has been a vast amount of research in the area of relational database design, to our knowledge, there has been very little work that co nsiders whether this theory is still valid when relations in the database m ay be incomplete. When relations are incomplete and thus contain null value s the problem of whether satisfaction is additive arises. Additivity is the property of the equivalence of the satisfaction of a set of functional dep endencies (FDs) F with the individual satisfaction of each member of F in a n incomplete relation. It is well known that, in general, satisfaction of F Ds is not additive. Previously we have shown that satisfaction is additive if and only if the set of FDs is monodependent. We conclude that monodepend ence is a fundamental desirable property of a set of FDs when considering i ncomplete information in relational database design. We show that, when the set of FDs F either satisfies the intersection property or the split-freen ess property, then the problem of finding an optimum cover of F can be solv ed in polynomial time in the size of F; in general, this problem is known t o be NP-complete. We also show that when F satisfies the split-freeness pro perty then deciding whether there is a superkey of cardinality k or less ca n be solved in polynomial time in the size of F, since all the keys have th e same cardinality. If F only satisfies the intersection property then this problem is NP-complete, as in the general case. Moreover, we show that whe n F either satisfies the intersection property or the split-freeness proper ty then deciding whether an attribute is prime can be solved in polynomial time in the size of F; in general, this problem is known to be NP-complete. Assume that a relation schema R is in an appropriate normal form with resp ect to a set of FDs F. We show that when F satisfies the intersection prope rty then the notions of second normal form and third normal form are equiva lent. We also show that when R is in Boyce-Codd Normal Form (BCNF), then F is monodependent if and only if either there is a unique key for R, or for all keys X for R, the cardinality of X is one less than the number of attri butes associated with R. Finally, we tackle a long-standing problem in rela tional database theory by showing that when a set of FDs F over R satisfies the intersection property, it also satisfies the split-freeness property ( i.e., is monodependent), if and only if every lossless join decomposition o f R with respect to F is also dependency preserving. As a corollary of this result we are able to show that when F satisfies the intersection property , it also satisfies the split-freeness property (i.e., is monodependent), i f and only if every lossless join decomposition of R, which is in BCNF, is also dependency preserving. Our final result is that when F is monodependen t, then there exists a unique optimum lossless join decomposition of R, whi ch is in BCNF, and is also dependency preserving. Furthermore, this ultimat e decomposition can be attained in polynomial time in the size of F.