GENE STRUCTURE PREDICTION USING INFORMATION ON HOMOLOGOUS PROTEIN-SEQUENCE

Citation
I.B. Rogozin et al., GENE STRUCTURE PREDICTION USING INFORMATION ON HOMOLOGOUS PROTEIN-SEQUENCE, Computer Applications in the Biosciences, 12(3), 1996, pp. 161-170
Number of citations
40
Subject categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
ISSN journal
0266-7061
Volume
12
Issue
3
Year of publication
1996
Pages
161 - 170
Database
ISI
SICI code
0266-7061(1996)12:3<161:GSPUIO>2.0.ZU;2-E
Abstract
In this paper a new approach for the prediction of protein-coding gene structures is described. The principal scheme of prediction is as follows. First, the exons with the best potential are predicted in a sequence of unknown function, and a list of potential amino acid fragments coded by these exons is formed. Second, the homology between each amino acid fragment from the list and proteins from the SWISS-PROT database of amino acid sequences is tested. The one protein with the best homology is chosen out of all the homologous sequences. Third, the exon-intron structure is reconstructed on the basis of its homology with the chosen protein sequence. The method was tested on an independent control set (20 genes). The results were as follows: 21% of real exons were lost and 3% of non-real exons were found. This system can be used to refine the results of gene prediction systems, especially if highly homologous proteins are found in the amino acid sequence database.
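The three steps in the abstract can be illustrated with a toy Python sketch. It is not the authors' implementation: the abstract does not say how exons are scored or how homology is measured, so the longest-common-substring score, the 0.5 threshold, the fixed forward reading frame, and all function names (`translate`, `fragment_score`, `choose_best_protein`, `select_exons`) are assumptions made for illustration only. A real system would use a proper alignment search against SWISS-PROT.

```python
from difflib import SequenceMatcher

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO))


def translate(dna):
    """Translate DNA in frame 0 on the forward strand (assumption);
    a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fragment_score(fragment, protein):
    """Crude stand-in for a homology score (assumption): length of the
    longest exact common substring, as a fraction of the fragment length."""
    if not fragment:
        return 0.0
    matcher = SequenceMatcher(None, fragment, protein, autojunk=False)
    match = matcher.find_longest_match(0, len(fragment), 0, len(protein))
    return match.size / len(fragment)


def choose_best_protein(fragments, database):
    """Step 2: pick the one database protein with the best total score."""
    return max(database, key=lambda name: sum(
        fragment_score(f, database[name]) for f in fragments))


def select_exons(candidate_exons, protein, threshold=0.5):
    """Step 3 (simplified): keep candidate exons whose translated fragment
    is supported by the chosen protein. The threshold is hypothetical."""
    return [exon for exon in candidate_exons
            if fragment_score(translate(exon), protein) >= threshold]


# Step 1 is assumed to have produced these candidate exons already.
candidates = [
    "ATGAAAACTGCTTATATT",  # encodes MKTAYI
    "GCTAAACAACGTCAAATT",  # encodes AKQRQI
    "CCTGGTCCTGGTCCTGGT",  # encodes PGPGPG (decoy)
]
# Tiny made-up protein set standing in for SWISS-PROT.
database = {
    "P1": "MKTAYIAKQRQISFVK",
    "P2": "GSHWWNDCELLPGG",
}

fragments = [translate(exon) for exon in candidates]
best = choose_best_protein(fragments, database)
kept = select_exons(candidates, database[best])
```

In this toy run the two exons that encode parts of `P1` are retained and the decoy is dropped, which mirrors how a strongly homologous protein can be used to filter candidate exons.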