S.M. Hebsgaard et al., Splice-site prediction in Arabidopsis thaliana pre-messenger RNA by combining local and global sequence information, Nucleic Acids Research, 24(17), 1996, pp. 3439-3452
Artificial neural networks have been combined with a rule-based system
to predict intron splice sites in the dicot plant Arabidopsis thaliana.
A two-step prediction scheme, where a global prediction of the coding
potential regulates a cutoff level for a local prediction of splice
sites, is refined by rules based on splice site confidence values,
prediction scores, coding context and distances between potential splice
sites. In this approach, the predictions of splice sites mutually affect
each other in a non-local manner. The combined approach drastically
reduces the large number of false positive splice sites normally haunting
splice site prediction. An analysis of the errors made by the networks
in the first step of the method revealed a previously unknown feature:
a frequent T-tract prolongation containing cryptic acceptor sites
in the 5' end of exons. The method presented here has been compared
with three other approaches: GeneFinder, GeneMark and Grail. Overall,
the method presented here is an order of magnitude better. We show that
the new method is able to find a donor site in the coding sequence for
the jellyfish Green Fluorescent Protein, exactly at the position that
was experimentally observed in A. thaliana transformants. Predictions
for alternatively spliced genes are also presented, together with
examples of genes from other dicots, monocots and algae. The method has
been made available through electronic mail (NetPlantGene@cbs.dtu.dk)
or on the WWW at http://www.cbs.dtu.dk/NetPlantGene.html.
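
The first step described in the abstract, where a global coding-potential prediction regulates the cutoff applied to a local splice-site score, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the functional form of the cutoff, so the linear rule, the window size, the constants (`base`, `slope`, `floor`) and all names below are hypothetical, chosen only to show the idea for donor sites: a sharp drop in coding potential (exon to intron) lowers the cutoff, while a flat coding context keeps it high.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    position: int       # index of the candidate donor site in the sequence
    local_score: float  # hypothetical local-network output in [0, 1]


def coding_transition(coding: Sequence[float], pos: int, window: int = 3) -> float:
    """Drop in coding potential across pos: upstream mean minus downstream mean."""
    left = coding[max(0, pos - window):pos]
    right = coding[pos:pos + window]
    if not left or not right:
        return 0.0  # no context on one side, so assume no transition
    return sum(left) / len(left) - sum(right) / len(right)


def variable_cutoff(transition: float, base: float = 0.9,
                    slope: float = 0.5, floor: float = 0.3) -> float:
    """Assumed linear rule: a stronger exon-to-intron drop gives a lower cutoff."""
    return max(floor, base - slope * max(0.0, transition))


def predict_donors(candidates: Sequence[Candidate],
                   coding: Sequence[float]) -> List[int]:
    """Keep candidates whose local score reaches the context-dependent cutoff."""
    accepted = []
    for c in candidates:
        cutoff = variable_cutoff(coding_transition(coding, c.position))
        if c.local_score >= cutoff:
            accepted.append(c.position)
    return accepted


# Toy input: coding potential is high for five positions, then low.
coding = [0.9] * 5 + [0.1] * 5
candidates = [Candidate(2, 0.6), Candidate(5, 0.6), Candidate(8, 0.95)]
print(predict_donors(candidates, coding))
```

In this toy input the two candidates with a local score of 0.6 are treated differently: the one at the coding-to-non-coding boundary passes the lowered cutoff, while the one inside the flat coding region does not. A candidate with a very high local score passes even without support from the global signal. The rule-based refinement mentioned in the abstract (confidence values, coding context, distances between sites) is not modelled here.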