A CONCEPTUAL CLUSTERING-ALGORITHM FOR DATABASE SCHEMA DESIGN

Citation
H.W. Beck et al., A CONCEPTUAL CLUSTERING-ALGORITHM FOR DATABASE SCHEMA DESIGN, IEEE Transactions on Knowledge and Data Engineering, 6(3), 1994, pp. 396-411
Number of citations
45
Subject Categories
Information Science & Library Science; Computer Sciences, Special Topics; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence
ISSN journal
1041-4347
Volume
6
Issue
3
Year of publication
1994
Pages
396 - 411
Database
ISI
SICI code
1041-4347(1994)6:3<396:ACCFDS>2.0.ZU;2-L
Abstract
Conceptual clustering techniques based on current theories of categorization provide a way to design database schemas that more accurately represent classes. An approach is presented in which classes are treated as complex clusters of concepts rather than as simple predicates. An important service provided by the database is determining whether a particular instance is a member of a class. A conceptual clustering algorithm based on theories of categorization aids in building classes by grouping related instances and developing class descriptions. The resulting database schema addresses a number of properties of categories, including default values and prototypes, analogical reasoning, exception handling, and family resemblance. Class cohesion results from trying to resolve conflicts between building generalized class descriptions and accommodating members of the class that deviate from these descriptions. This is achieved by combining techniques from machine learning, specifically explanation-based learning and case-based reasoning. A subsumption function is used to compare two class descriptions. A realization function is used to determine whether an instance meets an existing class description. A new function, INTERSECT, is introduced to compare the similarity of two instances. INTERSECT is used in defining an exception condition. Exception handling results in schema modification. This approach is applied to the database problems of schema integration, schema generation, query processing, and view creation.
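Illustration
The abstract names three functions (subsumption, realization, INTERSECT) and an exception condition but does not define them. The following is a minimal sketch of how they might fit together, not the authors' algorithm. It assumes that instances and class descriptions are flat attribute-value maps, that INTERSECT returns the attribute-value pairs two instances share, and that an exception is an instance that fails realization yet overlaps an existing class member. The function names, the `threshold` parameter, and the bird example are all hypothetical.

```python
from typing import Dict, Hashable, Iterable

# Assumption: instances and class descriptions are flat attribute-value maps.
Record = Dict[str, Hashable]


def intersect(a: Record, b: Record) -> Record:
    """Return the attribute-value pairs shared by two instances."""
    return {k: v for k, v in a.items() if k in b and b[k] == v}


def realizes(instance: Record, description: Record) -> bool:
    """True if the instance satisfies every pair in the class description."""
    return all(k in instance and instance[k] == v for k, v in description.items())


def subsumes(general: Record, specific: Record) -> bool:
    """True if every constraint of `general` also appears in `specific`."""
    return all(k in specific and specific[k] == v for k, v in general.items())


def is_exception(instance: Record, description: Record,
                 members: Iterable[Record], threshold: int = 1) -> bool:
    """Hypothetical exception condition: the instance does not meet the
    description but shares at least `threshold` pairs with some member."""
    if realizes(instance, description):
        return False
    return any(len(intersect(instance, m)) >= threshold for m in members)


# Hypothetical data: a class description, one regular member, one deviant.
bird = {"covering": "feathers", "locomotion": "flies", "lays_eggs": True}
robin = {"covering": "feathers", "locomotion": "flies", "lays_eggs": True,
         "color": "red"}
penguin = {"covering": "feathers", "locomotion": "swims", "lays_eggs": True}
rock = {"covering": "none"}
```

Under these assumptions, `robin` realizes `bird`, while `penguin` fails realization but shares two pairs with `robin`, so `is_exception(penguin, bird, [robin], threshold=2)` holds. In the approach described in the abstract, such a case would lead to schema modification rather than rejection of the instance.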