The paper describes two soft induction techniques, GDT-NR and GDT-RS, for
discovering classification rules from databases with uncertainty and
incompleteness. The techniques are based on a generalization distribution
table (GDT), in which the probabilistic relationships between concepts and
instances over discrete domains are represented. By using the GDT as a
probabilistic search space, (1) unseen instances can be considered in the
rule discovery process and the uncertainty of a rule, including its ability
to predict unseen instances, can be explicitly represented in the strength
of the rule;
(2) biases can be flexibly selected for search control, and background
knowledge can be used as a bias to control the creation of a GDT and the
rule discovery process. We describe how a GDT can be represented by a
variant of connectionist networks (GDT-NR for short), and how rules can be
discovered by learning on the GDT-NR. Furthermore, we combine the GDT with
the rough set
methodology (GDT-RS for short). By using GDT-RS, a minimal set of rules
with larger strengths can be acquired from databases with noisy, incomplete
data. We compare GDT-NR with GDT-RS and describe why GDT-RS is better
suited than GDT-NR to large, complex databases. (C) 2001 Elsevier Science
B.V. All rights reserved.
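
As an illustration only, the idea of a table relating generalizations
(concepts) to instances over discrete domains might be sketched as follows.
This is a minimal sketch under stated assumptions, not the construction
used in the paper: it assumes a uniform prior over the instances covered by
each generalization, and the function name `build_gdt`, the wildcard symbol
`*`, and the toy attribute domains are all hypothetical.

```python
from itertools import product

WILDCARD = "*"  # hypothetical marker for "any value of this attribute"


def build_gdt(domains):
    """Return {generalization: {instance: probability}}.

    Assumption: each generalization spreads probability uniformly over the
    instances it covers (an equiprobable prior with no background knowledge).
    """
    names = list(domains)
    # Every combination of concrete attribute values is a possible instance.
    instances = list(product(*(domains[n] for n in names)))
    # A generalization replaces at least one attribute value by the wildcard.
    generalizations = [
        g
        for g in product(*([*domains[n], WILDCARD] for n in names))
        if WILDCARD in g
    ]
    table = {}
    for g in generalizations:
        covered = [
            inst
            for inst in instances
            if all(gv == WILDCARD or gv == iv for gv, iv in zip(g, inst))
        ]
        table[g] = {inst: 1.0 / len(covered) for inst in covered}
    return table


# Toy domains (assumed for illustration): attribute a has 2 values, b has 3.
domains = {"a": ["a0", "a1"], "b": ["b0", "b1", "b2"]}
gdt = build_gdt(domains)
for generalization, row in gdt.items():
    print(generalization, row)
```

In this sketch each row is a probability distribution over the instances
that a generalization covers, including instances that never occur in the
data, which is one way to read the abstract's point about taking unseen
instances into account.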