RELATIONAL KNOWLEDGE DISCOVERY IN A CHINESE CHARACTER DATABASE

Citation
Jd. Zucker et al., RELATIONAL KNOWLEDGE DISCOVERY IN A CHINESE CHARACTER DATABASE, Applied Artificial Intelligence, 12(5), 1998, pp. 455-488
Number of citations
69
Subject Categories
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
ISSN journal
0883-9514
Volume
12
Issue
5
Year of publication
1998
Pages
455 - 488
Database
ISI
SICI code
0883-9514(1998)12:5<455:RKDIAC>2.0.ZU;2-N
Abstract
This article describes a novel application of Inductive Logic Programming (ILP) to the problem of data mining relational databases. The task addressed here consists of mining a relational database of more than 200,000 ground facts describing 6768 Chinese characters. Mining this relational database may be recast in an ILP setting, where the association rules sought are represented as nondeterminate Horn clauses, a type of clause known for being computationally hard to learn. We have introduced a new kind of language bias, S-structural indeterminate clauses, which takes into account the meaning of part-of predicates that play a key role in the complexity of learning in structural domains. The ILP algorithm REPART has been specifically developed to learn S-structural indeterminate clauses. Its efficiency lies in a particular change of representation that enables the use of propositional learners. This article presents original results discovered by REPART that exemplify how ILP algorithms may not only scale up efficiently to large relational databases but also discover useful and computationally hard to learn patterns.
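
The change of representation mentioned in the abstract can be illustrated with a minimal sketch. This is not REPART itself, whose details the abstract does not give. It only shows the general idea of flattening part-of facts into attribute-value rows that a propositional learner could consume. The facts, the names `PART_OF` and `SHAPE`, the function `propositionalize`, and the choice of two parts per row are all hypothetical and chosen for illustration.

```python
from itertools import permutations

# Hypothetical ground facts (not from the paper):
# part_of(component, character) and one attribute per component.
PART_OF = [("c1", "char_a"), ("c2", "char_a"), ("c3", "char_b"), ("c4", "char_b")]
SHAPE = {"c1": "water", "c2": "tree", "c3": "water", "c4": "mouth"}


def propositionalize(part_of, shape, arity=2):
    """Return one attribute-value row per ordered tuple of distinct parts of each object.

    Because part_of is nondeterminate (an object has several parts),
    a single object yields several rows.
    """
    # Group components by the object they belong to.
    parts = {}
    for comp, obj in part_of:
        parts.setdefault(obj, []).append(comp)

    rows = []
    for obj, comps in sorted(parts.items()):
        # Each ordered choice of `arity` distinct parts becomes one row.
        for combo in permutations(comps, arity):
            row = {"object": obj}
            for i, comp in enumerate(combo, start=1):
                row[f"part{i}_shape"] = shape[comp]
            rows.append(row)
    return rows


rows = propositionalize(PART_OF, SHAPE)
for row in rows:
    print(row)
```

Each character contributes one row per ordered pair of its components, so the table grows with the number of parts per object. That growth is one reason restricting the clause language, as the abstract describes, matters for efficiency.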