After a review of approaches to nucleotide correlation in DNA
sequences, the preferential mode analysis method is emphasized and
discussed in detail. The preferred and poor modes in coding regions,
as well as in introns, 5'-caps and 3'-tails, are found through
statistical analysis of sequence data from species of all kinds in
GenBank. The relation between preferential mode analysis and the
informational parameter method is deduced. It is found that in higher
species the coding sequences preferentially use the strong-weak bond
language (strong bond = C,G; weak bond = A,T), whereas many noncoding
regions (introns, 5'-caps, 3'-tails) use the purine-pyrimidine
language. The use of different languages in coding and noncoding
sequences is a result of evolution, and it may be related to the
functional differences between these two regions. Furthermore, we find
that many preferential triplets in coding sequences can be expressed
in the form (W S) (W = A,T; S = C,G), which may be explained by its
relation to tRNA abundance. A systematic change of some mode contents
with evolution has also been found.
(C) 1997 Academic Press Limited.
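
The two reduced alphabets named above (strong-weak and purine-pyrimidine) can be illustrated with a minimal Python sketch. It only recodes a sequence and counts short modes in the recoded string; it is not the statistical criterion used here to decide which modes are preferred or poor. The names `recode` and `mode_counts`, the R/Y labels for purine/pyrimidine, and the example sequence are assumptions made for illustration.

```python
from collections import Counter

# Strong-weak bond alphabet as defined in the text: S = C,G; W = A,T.
STRONG_WEAK = {"C": "S", "G": "S", "A": "W", "T": "W"}
# Purine-pyrimidine alphabet (R/Y labels are an assumed convention):
# purines A,G -> R; pyrimidines C,T -> Y.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode(seq, table):
    """Rewrite a DNA sequence in a two-letter alphabet."""
    return "".join(table[base] for base in seq.upper())


def mode_counts(seq, table, k=3):
    """Count overlapping k-letter modes in the recoded sequence."""
    coded = recode(seq, table)
    return Counter(coded[i:i + k] for i in range(len(coded) - k + 1))


if __name__ == "__main__":
    example = "ATGGCCAAGT"  # hypothetical sequence, not from GenBank
    print(recode(example, STRONG_WEAK))
    print(recode(example, PURINE_PYRIMIDINE))
    print(mode_counts(example, STRONG_WEAK).most_common(3))
```

Comparing such counts between coding and noncoding regions would be one simple way to see which of the two alphabets shows the stronger mode preferences in a given region.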