ON THE USE OF MACHINE LEARNING TO IDENTIFY TOPOLOGICAL RULES IN THE PACKING OF BETA-STRANDS

Citation
R.D. King et al., ON THE USE OF MACHINE LEARNING TO IDENTIFY TOPOLOGICAL RULES IN THE PACKING OF BETA-STRANDS, Protein engineering, 7(11), 1994, pp. 1295-1303
Number of citations
29
Subject Categories
Biology
Journal title
Protein engineering
ISSN journal
02692139
Volume
7
Issue
11
Year of publication
1994
Pages
1295 - 1303
Database
ISI
SICI code
0269-2139(1994)7:11<1295:OTUOML>2.0.ZU;2-J
Abstract
The machine learning program GOLEM was applied to discover topological rules in the packing of beta-sheets in alpha/beta-domain proteins. Rules (constraints) were determined for four features of beta-sheet packing: (i) whether a beta-strand is at an edge; (ii) whether two consecutive beta-strands pack parallel or anti-parallel; (iii) whether two beta-strands pack adjacently; and (iv) the winding direction of two consecutive beta-strands. Rules were found with high predictive accuracy and coverage. The errors were generally associated with complications in domain folds, especially in doubly wound domains. Investigation of the rules revealed interesting patterns, some of which were known previously, others that are novel. Novel features include (i) the relationship between pairs of sequential strands is in general one of decreasing size; (ii) more sequential pairs of strands wind in the direction out than in; and (iii) it takes a larger alteration in hydrophobicity to change a strand from winding in the direction out than in. These patterns in the data may be the result of folding pathways in the domains. The rules found are of predictive value and could be used in the combinatorial prediction of protein structure, or as a general test of model structures, e.g. those produced by threading. We conclude that machine learning has a useful role in the analysis of protein structures.