V. Brendel and J. Kleffe, PREDICTION OF LOCALLY OPTIMAL SPLICE SITES IN PLANT PRE-MESSENGER-RNA WITH APPLICATIONS TO GENE IDENTIFICATION IN ARABIDOPSIS-THALIANA GENOMIC DNA, Nucleic acids research, 26(20), 1998, pp. 4748-4757
Prediction of splice site selection and efficiency from sequence inspection is of fundamental interest (testing the current knowledge of requisite sequence features) and practical importance (genome annotation, design of mutant or transgenic organisms). In plants, the dominant variables affecting splice site selection and efficiency include the degree of matching to the extended splice site consensus and the local gradient of U- and G+C-composition (introns being U-rich and exons G+C-rich). We present a novel method for splice site prediction, trained specifically for maize and Arabidopsis thaliana. The method extends our previous algorithm based on logitlinear models by considering three variables simultaneously: intrinsic splice site strength, local optimality, and fit with respect to the overall splice pattern prediction. We show that the method considerably improves prediction specificity without compromising the high degree of sensitivity required in gene prediction algorithms. Applications to gene identification are illustrated for Arabidopsis and suggest that successful methods must combine scoring for splice sites, coding potential, and similarity with potential homologs in non-trivial ways. A WWW version of the SplicePredictor program is available at http:/gnomic.stanford.edu/ volker/SplicePredictor.html/.
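
Two of the ideas named in the abstract, the local U and G+C composition gradient and local optimality of a candidate site, can be illustrated with a minimal Python sketch. This is not the SplicePredictor implementation: the abstract does not give the paper's definitions, so the function names, the window and radius parameters, and the rule "strictly highest score among candidates within a radius" are all assumptions made for illustration only.

```python
def base_fraction(seq, bases):
    """Fraction of characters in seq that belong to the given set of bases."""
    if not seq:
        return 0.0
    return sum(1 for b in seq.upper() if b in bases) / len(seq)


def composition_contrast(seq, pos, window=50):
    """Hypothetical composition contrast across a candidate donor site.

    pos is the index of the first intron base (an assumption of this sketch).
    Returns (u_gain, gc_drop): how much more U-rich the downstream window is,
    and how much more G+C-rich the upstream window is.
    T is counted as U so that DNA input also works.
    """
    upstream = seq[max(0, pos - window):pos]
    downstream = seq[pos:pos + window]
    u_gain = base_fraction(downstream, "UT") - base_fraction(upstream, "UT")
    gc_drop = base_fraction(upstream, "GC") - base_fraction(downstream, "GC")
    return u_gain, gc_drop


def locally_optimal(scores, radius):
    """Keep candidate positions whose score is strictly higher than that of
    every other candidate within `radius` bases (assumed definition).

    scores maps sequence position -> site score, e.g. a probability from
    some previously fitted model.
    """
    keep = []
    for p, s in scores.items():
        neighbours = (t for q, t in scores.items()
                      if q != p and abs(q - p) <= radius)
        if all(s > t for t in neighbours):
            keep.append(p)
    return sorted(keep)
```

For example, with candidate scores `{100: 0.9, 120: 0.6, 400: 0.7}` and a radius of 50, the site at 120 is discarded because the nearby site at 100 scores higher, while the isolated site at 400 is retained.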