P. Salamon et al., ON THE ROBUSTNESS OF MAXIMUM-ENTROPY RELATIONSHIPS FOR COMPLEXITY DISTRIBUTIONS OF NUCLEOTIDE-SEQUENCES, Computers & chemistry, 17(2), 1993, pp. 135-148
Given a functionally equivalent set of natural nucleotide sequences, the distribution of local compositional complexity among all subsequences of this set appears to be as random as possible consistent with the mean complexity of such subsequences. The robustness of this relationship and its possible causes have been explored by means of (1) dynamic simulations based on models of biased substitution mutations, (2) equilibrium models incorporating known mononucleotide probabilities, and (3) extension of the analyses, previously carried out on short oligonucleotides, to much larger subsequences. The maximum entropy effect evidently follows from almost any mechanism for substitution mutation dynamics that incorporates a systematic bias toward low complexity. The effect is only partially explained by unequal mononucleotide probabilities. The complexity distributions for larger subsequences in the length range 40-120 nucleotides show a novel regularity of structure ('feathering') that is not yet explained.
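
As an illustration of the quantity under discussion, the following minimal sketch tabulates a complexity distribution over all subsequences of a fixed length. It is not the authors' procedure: the abstract does not define the complexity measure, so the Shannon entropy of the base composition of each window is used here as an assumed stand-in, and the names window_complexity and complexity_distribution are hypothetical.

```python
import math
from collections import Counter


def window_complexity(window: str) -> float:
    """Shannon entropy (in bits) of the base composition of one window.

    Assumption: this is a stand-in measure; the abstract does not say
    which definition of compositional complexity the paper uses.
    """
    n = len(window)
    counts = Counter(window)
    # Sum the per-base terms -p * log2(p); a single-base window gives 0.
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def complexity_distribution(seq: str, k: int) -> Counter:
    """Histogram of complexity values over every length-k subsequence."""
    return Counter(
        round(window_complexity(seq[i:i + k]), 6)
        for i in range(len(seq) - k + 1)
    )


def mean_complexity(dist: Counter) -> float:
    """Mean complexity implied by a histogram of complexity values."""
    total = sum(dist.values())
    return sum(value * count for value, count in dist.items()) / total


if __name__ == "__main__":
    # Toy sequence for illustration only, not data from the paper.
    dist = complexity_distribution("AAAACGTTTTGCAACG", 4)
    for value, count in sorted(dist.items()):
        print(f"{value:.3f}\t{count}")
    print("mean:", round(mean_complexity(dist), 3))
```

A histogram of this kind, together with its mean, is the sort of object a maximum entropy comparison would be made against; the comparison itself is not attempted here.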