M. Turcotte et al., The effect of relational background knowledge on learning of protein three-dimensional fold signatures, MACH LEARN, 43(1-2), 2001, pp. 81-95
As a form of Machine Learning, the study of Inductive Logic Programming (ILP)
is motivated by a central belief: relational description languages are better
(in terms of accuracy and understandability) than propositional ones for
certain real-world applications. This claim is investigated here for a
particular application in structural molecular biology, that of constructing
readable descriptions of the major protein folds. To the authors' knowledge,
Machine Learning has not previously been applied systematically to this task.
In this application, the domain expert (third author) identified a natural
divide between essentially propositional features and more
structurally-oriented relational ones. The following null hypotheses are
tested: 1) for a given ILP system (Progol), provision of relational background
knowledge
does not increase predictive accuracy, 2) a good propositional learning
system (C5.0) without relational background knowledge will outperform Progol
with relational background knowledge, 3) relational background knowledge does
not produce improved explanatory insight. Null hypotheses 1) and 2) are both
refuted by cross-validation results obtained over 20 of the most populated
protein folds. Null hypothesis 3) is refuted by the demonstration of various
insightful rules found only among the relationally-oriented learned rules.