P. Bernaola-Galvan et al., Finding borders between coding and noncoding DNA regions by an entropic segmentation method, Phys. Rev. Lett., 85(6), 2000, pp. 1342-1345
We present a new computational approach to finding borders between coding
and noncoding DNA. This approach has two features: (i) DNA sequences are
described by a 12-letter alphabet that captures the differential base
composition at each codon position, and (ii) the search for the borders is
carried out by means of an entropic segmentation method which uses only the
general statistical properties of coding DNA. We find that this method is
highly accurate in finding borders between coding and noncoding regions and
requires no "prior training" on known data sets. Our results appear to be
more accurate than those obtained with moving windows in the discrimination
of coding from noncoding DNA.
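
The two features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the 12-letter alphabet pairs each base (A, C, G, T) with its sequence position modulo 3, and that the entropic criterion is the Jensen-Shannon divergence between the symbol compositions on either side of a candidate border; the abstract states neither detail. The names encode12 and best_cut are hypothetical, and the sketch finds a single border only, with no significance test and no recursion.

```python
import math
from collections import Counter


def encode12(seq, phase=0):
    """Pair each base with its position modulo 3 (assumed 12-letter alphabet)."""
    return [(base, (i + phase) % 3) for i, base in enumerate(seq)]


def entropy(counts, n):
    """Shannon entropy in bits of a symbol count table with n total symbols."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)


def best_cut(symbols):
    """Return (index, divergence) of the single cut that maximizes the
    Jensen-Shannon divergence between the left and right compositions.
    Assumption: this stands in for the paper's entropic criterion."""
    n = len(symbols)
    total = Counter(symbols)
    h_total = entropy(total, n)
    left, right = Counter(), Counter(total)
    best_i, best_d = None, -1.0
    for i in range(1, n):
        s = symbols[i - 1]          # move one symbol from right to left
        left[s] += 1
        right[s] -= 1
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


if __name__ == "__main__":
    # Toy input: a strongly period-3 stretch followed by a uniform stretch.
    toy = "ACG" * 40 + "T" * 120
    print(best_cut(encode12(toy)))
```

Applying best_cut recursively to each resulting segment, and stopping when the divergence is no longer statistically significant, would give a full segmentation; the stopping rule is not specified in the abstract and is left out here.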