A MACHINE DISCOVERY FROM AMINO-ACID-SEQUENCES BY DECISION TREES OVER REGULAR PATTERNS

Citation
S. Arikawa et al., A MACHINE DISCOVERY FROM AMINO-ACID-SEQUENCES BY DECISION TREES OVER REGULAR PATTERNS, New generation computing, 11(3-4), 1993, pp. 361-375
Number of citations
22
Subject categories
Computer Sciences; Computer Applications & Cybernetics
Journal title
New generation computing
ISSN journal
0288-3635
Volume
11
Issue
3-4
Year of publication
1993
Pages
361 - 375
Database
ISI
SICI code
0288-3635(1993)11:3-4<361:AMDFAB>2.0.ZU;2-4
Abstract
This paper describes a machine learning system that discovered a "negative motif" in transmembrane domain identification from amino acid sequences, and reports its experiments on protein data using the PIR database. We introduce a decision tree whose nodes are labeled with regular patterns. As a hypothesis, the system produces such a decision tree for a small number of randomly chosen positive and negative examples from PIR. Experiments show that our system finds reasonable hypotheses very successfully. As a theoretical foundation, we show that the class of languages defined by decision trees of depth at most d over k-variable regular patterns is polynomial-time learnable in the sense of probably approximately correct (PAC) learning for any fixed d, k ≥ 0.
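
The hypothesis class named in the abstract, a decision tree whose internal nodes are labeled with regular patterns, can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a regular pattern is taken to be a string of constants and variables in which each variable occurs at most once, every variable is written with the hypothetical marker `*`, each variable stands for a non-empty string (the paper's exact convention may differ), and a leaf is a boolean label. The names `matches`, `Node`, `classify`, and `toy_tree` are invented here, and the example tree and sequences are made up for illustration; they are not results from the paper.

```python
import re
from dataclasses import dataclass
from typing import Union

VAR = "*"  # hypothetical marker for a pattern variable (assumption, not from the paper)


def matches(pattern: str, sequence: str) -> bool:
    """Return True if `sequence` belongs to the language of the regular pattern.

    Assumption: each variable is replaced by a non-empty string, so every
    variable is translated to the regular expression `.+`.
    """
    regex = "".join(".+" if ch == VAR else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, sequence, flags=re.DOTALL) is not None


@dataclass
class Node:
    pattern: str          # regular pattern tested at this node
    if_match: "Tree"      # subtree followed when the sequence matches
    if_no_match: "Tree"   # subtree followed otherwise


# A tree is either an internal node or a boolean leaf label.
Tree = Union[Node, bool]


def classify(tree: Tree, sequence: str) -> bool:
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, Node):
        tree = tree.if_match if matches(tree.pattern, sequence) else tree.if_no_match
    return tree


# Made-up depth-2 tree: a sequence with an interior D is labeled negative;
# otherwise it is labeled positive only if it has an interior L.
toy_tree = Node("*D*", False, Node("*L*", True, False))

if __name__ == "__main__":
    for seq in ["AADAA", "ALLA", "GGGG"]:
        print(seq, classify(toy_tree, seq))
```

In this sketch the root node acts as a rejecting test: any sequence matching its pattern is labeled negative without further tests, which is one way to read the idea of a motif whose presence rules a sequence out. Learning such a tree from positive and negative examples, which is the subject of the paper, is not attempted here.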