ESTIMATING PROTEIN FUNCTION FROM COMBINATORIAL SEQUENCE DATA USING DECISION ALGORITHMS AND NEURAL NETWORKS

Citation
Er. Goldman et al., ESTIMATING PROTEIN FUNCTION FROM COMBINATORIAL SEQUENCE DATA USING DECISION ALGORITHMS AND NEURAL NETWORKS, Drug development research, 33(2), 1994, pp. 125-132
Number of citations
30
Subject Categories
Pharmacology & Pharmacy
Journal title
Drug Development Research
Journal ISSN
0272-4391
Volume
33
Issue
2
Year of publication
1994
Pages
125 - 132
Database
ISI
SICI code
0272-4391(1994)33:2<125:EPFFCS>2.0.ZU;2-G
Abstract
Correlations between protein sequences and phenotypes were explored using databases of combinatorial cassette mutants of pigment-protein complexes. Heuristically formulated decision algorithms and computer-implemented neural networks were compared to determine their accuracy in classification of phenotypic categories. For the databases examined, decision algorithms employing very simple rules were able to properly classify mutants 80-84% of the time, based only on the amino acid sequence of the mutagenized region. Such decision algorithms did not require the formulation of any rules that involved site-to-site interactions, but rather performed well based on the stringency of specific critical sites in the protein that accept only a restricted set of amino acids. In some cases, neural networks scored almost 10% higher than decision algorithms on the same databases (i.e., 94%). However, the success of the primitive decision algorithms and perceptrons at sorting sequences into categories suggests that linear effects predominate in the classification of a mutant's phenotype. Such methods should be generally applicable to the broad spectrum of databases that are currently being generated in combinatorial chemistry and biology experiments. (C) 1994 Wiley-Liss, Inc.
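
Illustrative sketch
The kind of decision algorithm the abstract describes, one that judges each critical site independently against a restricted set of tolerated amino acids and uses no site-to-site interaction rules, might look roughly like the Python below. This is only a sketch under stated assumptions: the positions, the residue sets, the two class labels, and the function names classify and accuracy are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical rule table: position within the mutagenized region -> residues
# tolerated at that site. Positions and residues are invented for illustration
# and are NOT taken from the paper.
ALLOWED: Dict[int, Set[str]] = {
    0: {"A", "G", "S"},
    3: {"L", "I", "V", "M"},
}


def classify(sequence: str, allowed: Dict[int, Set[str]] = ALLOWED) -> str:
    """Label a mutant 'functional' only if every critical site holds a tolerated residue.

    Each site is checked on its own, so no site-to-site interaction is modelled.
    """
    for position, residues in allowed.items():
        if position >= len(sequence) or sequence[position] not in residues:
            return "nonfunctional"
    return "functional"


def accuracy(labelled: Iterable[Tuple[str, str]],
             allowed: Dict[int, Set[str]] = ALLOWED) -> float:
    """Fraction of (sequence, observed_label) pairs that the rules classify correctly."""
    pairs = list(labelled)
    correct = sum(classify(seq, allowed) == label for seq, label in pairs)
    return correct / len(pairs)
```

Scoring such a rule table against a database of mutants with known phenotypes yields a classification rate comparable in kind to the 80-84% figure reported above, although the actual rules and categories used by the authors are not given in this record.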