A SEGMENT-BASED DYNAMIC-PROGRAMMING ALGORITHM FOR PREDICTING GENE STRUCTURE

Authors
Citation
Td. Wu, A SEGMENT-BASED DYNAMIC-PROGRAMMING ALGORITHM FOR PREDICTING GENE STRUCTURE, Journal of computational biology, 3(3), 1996, pp. 375-394
Citations number
16
Categorie Soggetti
Biology,"Biochemical Research Methods",Mathematics
ISSN journal
10665277
Volume
3
Issue
3
Year of publication
1996
Pages
375 - 394
Database
ISI
SICI code
1066-5277(1996)3:3<375:ASDAFP>2.0.ZU;2-R
Abstract
An algorithm called segment-based dynamic programming is described for predicting gene structure from a sequence of genomic DNA, The algorit hm explores the space of gene structures that satisfy junctional and f rame constraints and finds the gene structure that optimizes the sum o f junctional and segmental scoring functions. Junctional constraints s pecify acceptable sites of initiation, termination, and splicing, wher eas frame constraints ensure that the total exon length is a multiple of three and that no in-frame stop codons occur within exons or at exo n-exon junctions, By computing over segments, segment-based dynamic pr ogramming differs from existing dynamic programming algorithms, which compute over individual nucleotides or over clusters of exon candidate s, Because segment-based dynamic programming maintains reading frame a nd phase information for each segment, it can assemble exons in-frame as well as score them in-frame, The algorithm is used to quantify the computational power of constraints, Experimental results show that fra me constraints reduce the size of the search space by several orders o f magnitude and that cardinality constraints place an asymptotic limit on the size of the search space, The algorithm is also used to compar e the accuracy of different methods for assembly and scoring, A scorin g scheme based on fifth-order Markov hexamer frequencies is presented and used in three objective functions, corresponding to in-frame, fram e-independent, and frame-maximal scoring strategies, Experimental resu lts show that in-frame assembly improves specificity only slightly ove r frame-independent assembly, whereas in-frame scoring improves specif icity substantially over frame-independent and frame-maximal scoring.