PREDICTING INTERNAL EXONS BY OLIGONUCLEOTIDE COMPOSITION AND DISCRIMINANT-ANALYSIS OF SPLICEABLE OPEN READING FRAMES

Citation
V.V. Solovyev et al., PREDICTING INTERNAL EXONS BY OLIGONUCLEOTIDE COMPOSITION AND DISCRIMINANT-ANALYSIS OF SPLICEABLE OPEN READING FRAMES, Nucleic acids research, 22(24), 1994, pp. 5156-5163
Number of citations
28
Subject categories
Biology
Journal title
Nucleic acids research
Journal ISSN
0305-1048
Volume
22
Issue
24
Year of publication
1994
Pages
5156 - 5163
Database
ISI
SICI code
0305-1048(1994)22:24<5156:PIEBOC>2.0.ZU;2-I
Abstract
A new method which predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses the linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of oligonucleotides in protein coding and intron regions. The accuracy of our splice site recognition function is 97% for donor splice sites and 96% for acceptor splice sites. For exon prediction, we combine in a discriminant function the characteristics describing the 5'-intron region, donor splice site, coding region, acceptor splice site and 3'-intron region for each open reading frame flanked by GT and AG base pairs. The accuracy of precise internal exon recognition on a test set of 451 exon and 246693 pseudoexon sequences is 77% with a specificity of 79%. The recognition quality computed at the level of individual nucleotides is 89% for exon sequences and 98% for intron sequences. This corresponds to a correlation coefficient for exon prediction of 0.87. The precision of this approach is better than other methods and has been tested on a larger data set. We have also developed a means for predicting exon-exon junctions in cDNA sequences, which can be useful for selecting optimal PCR primers.
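
The candidate-generation step described in the abstract (open reading frames flanked by AG and GT, each scored by a linear discriminant function) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `min_len` cutoff, and the feature, weight, and bias values passed to `linear_discriminant` are all hypothetical, and the sketch assumes an internal exon is preceded by the AG of an acceptor site and followed by the GT of a donor site.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_open_frame(segment):
    """Return True if at least one of the three reading frames has no stop codon."""
    for frame in range(3):
        codons = (segment[i:i + 3] for i in range(frame, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            return True
    return False


def candidate_internal_exons(seq, min_len=6):
    """Yield (start, end) half-open intervals that are preceded by AG,
    followed by GT, and open in at least one reading frame.

    min_len is a hypothetical cutoff chosen only for this illustration.
    """
    seq = seq.upper()
    # Position just after each AG dinucleotide (possible exon start).
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # Position of each GT dinucleotide (possible exon end, exclusive).
    ends = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    for start in starts:
        for end in ends:
            if end - start >= min_len and has_open_frame(seq[start:end]):
                yield (start, end)


def linear_discriminant(features, weights, bias=0.0):
    """Score a candidate as a weighted sum of its features plus a bias.

    The features and weights are placeholders; the abstract names the
    characteristics (intron regions, splice sites, coding region) but
    gives no numeric values.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


if __name__ == "__main__":
    toy = "CCAGATGGCCAAAGTCC"  # made-up sequence, not real data
    for start, end in candidate_internal_exons(toy):
        print(start, end, toy[start:end])
```

In a full predictor, each interval would be described by the five characteristics listed in the abstract and kept or rejected according to its discriminant score; the large number of pseudoexons in the test set reflects how many such AG/GT-flanked intervals occur by chance.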