We assessed the information content of nucleotide sequences through
the efficiency with which a complete sequence can be reconstructed
from a set of its fragments (a frequency-correlation dictionary). As
the measure of efficiency, we used the increase in the entropy of the
reconstructed sequence for frequency-correlation dictionaries of
q-letter-long words. For human genes, this increase is maximal at
q = 5, 6, and 7. We also found abnormally efficient reconstruction
from dictionaries with q = 3 and q = 2, which distinguishes natural
nucleotide sequences from random ones.
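
The abstract does not state the formulas behind the dictionaries or the entropy measure, so the following is only a minimal sketch under two assumptions: that a dictionary of q-letter words can be represented as the table of relative frequencies of all overlapping q-letter words in a sequence, and that entropy means Shannon entropy of that table. The function names `frequency_dictionary` and `shannon_entropy` and the toy sequence are hypothetical, and the reconstruction step itself is not implemented here.

```python
from collections import Counter
from math import log


def frequency_dictionary(seq, q):
    """Relative frequencies of all overlapping q-letter words in seq.

    Assumption: this is an illustrative stand-in for the paper's
    frequency-correlation dictionary, not its exact definition.
    """
    if q < 1 or q > len(seq):
        raise ValueError("q must be between 1 and len(seq)")
    counts = Counter(seq[i:i + q] for i in range(len(seq) - q + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def shannon_entropy(freqs):
    """Shannon entropy, in nats, of a frequency dictionary."""
    return -sum(f * log(f) for f in freqs.values() if f > 0)


# Made-up toy sequence, used only to show the calls.
toy = "ACGTACGTTGCA"
for q in (1, 2, 3):
    d = frequency_dictionary(toy, q)
    print(q, len(d), round(shannon_entropy(d), 3))
```

Comparing the entropy of a dictionary reconstructed from shorter words with that of the dictionary measured directly would be the next step, which requires the reconstruction rule from the full paper.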