In data mining, we emphasize the need for learning from huge, incomplete,
and imperfect data sets. To handle noise in the problem domain, existing
learning systems avoid overfitting the imperfect training examples by
excluding insignificant patterns. The problem is that these systems use a limiting
attribute-value language for representing the training examples and the
induced knowledge. Moreover, some important patterns are ignored because they
are statistically insignificant. In this article, we present a framework
that combines Genetic Programming and Inductive Logic Programming to induce
knowledge represented in various knowledge representation formalisms from
noisy databases. The framework is based on a formalism of logic grammars, and
it can specify the search space declaratively. An implementation of the
framework, LOGENPRO (The Logic grammar based GENetic PROgramming system), has
been developed. The performance of LOGENPRO is evaluated on the chess
end-game domain. We compare LOGENPRO with FOIL and other learning systems in
detail and find that its performance is significantly better than that of the
others. This result indicates that the Darwinian principle of natural
selection is a plausible noise-handling method that can avoid overfitting and
identify important patterns at the same time. Moreover, the system is applied to
one real-life medical database. The knowledge discovered provides insights
into, and allows a better understanding of, the medical domain.