PERFORMANCE-GUARANTEE GENE PREDICTIONS VIA SPLICED ALIGNMENT

Citation
Aa. Mironov et al., PERFORMANCE-GUARANTEE GENE PREDICTIONS VIA SPLICED ALIGNMENT, Genomics (San Diego, Calif.), 51(3), 1998, pp. 332-339
Number of citations
19
Subject Categories
Biotechnology & Applied Microbiology; Genetics & Heredity
ISSN journal
0888-7543
Volume
51
Issue
3
Year of publication
1998
Pages
332 - 339
Database
ISI
SICI code
0888-7543(1998)51:3<332:PGPVSA>2.0.ZU;2-W
Abstract
An important and still unsolved problem in gene prediction is designing an algorithm that not only predicts genes but estimates the quality of individual predictions as well. Since experimental biologists are interested mainly in the reliability of individual predictions (rather than in the average reliability of an algorithm) we attempted to develop a gene recognition algorithm that guarantees a certain quality of predictions. We demonstrate here that the similarity level with a related protein is a reliable quality estimator for the spliced alignment approach to gene recognition. We also study the average performance of the spliced alignment algorithm for different targets on a complete set of human genomic sequences with known relatives and demonstrate that the average performance of the method remains high even for very distant targets. Using plant, fungal, and prokaryotic target proteins for recognition of human genes leads to accurate predictions with 95, 93, and 91% correlation coefficient, respectively. For target proteins with similarity score above 60%, not only the average correlation coefficient is very high (97% and up) but also the quality of individual predictions is guaranteed to be at least 82%. It indicates that for this level of similarity the worst case performance of the spliced alignment algorithm is better than the average case performance of many statistical gene recognition methods. (C) 1998 Academic Press